GUI Agents Papers

AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement

Siqi Pei, Liang Tang, Tiaonan Duan, Long Chen, Shuxian Li, Kaer Huang, Yanzhe Jing, Yiqiang Yan, Bo Zhang, Chenghao Jiang, Borui Zhang, Jiwen Lu

🏛 Institutions
Lenovo Research
📅 Date
March 18, 2026
📑 Publisher
arXiv
💻 Env
General GUI
🔑 Keywords
TLDR

AdaZoom-GUI targets two concrete GUI-grounding bottlenecks: ambiguous natural-language instructions and tiny UI elements in high-resolution screenshots. It combines instruction rewriting with a conditional second-stage zoom-in pass, and the authors report state-of-the-art grounding performance among models of comparable size.
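
To make the idea of a conditional zoom-in pass concrete, here is a minimal Python sketch. It is not the authors' implementation. The trigger rule (zoom when the first-pass box covers a very small fraction of the screen), the `small_area_ratio` and `crop_size` values, and the `refine`, `coarse`, and `fine` callables are all hypothetical stand-ins, since this summary does not say how the paper decides when to zoom. The sketch only shows the general flow: rewrite the instruction, ground once on the full screenshot, optionally ground again inside a crop, and map the crop-local result back to screen coordinates.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]   # (left, top, right, bottom), pixels
Window = Tuple[int, int, int, int]        # crop rectangle in screen pixels
Point = Tuple[float, float]


@dataclass(frozen=True)
class ZoomConfig:
    # Hypothetical knobs, chosen only for illustration.
    small_area_ratio: float = 0.001  # zoom if box area / screen area is below this
    crop_size: int = 512             # side length of the square crop


def needs_zoom(box: Box, screen_w: int, screen_h: int, cfg: ZoomConfig) -> bool:
    """Assumed trigger: the first-pass box is tiny relative to the screen."""
    left, top, right, bottom = box
    area = max(0.0, right - left) * max(0.0, bottom - top)
    return area / (screen_w * screen_h) < cfg.small_area_ratio


def crop_window(box: Box, screen_w: int, screen_h: int, cfg: ZoomConfig) -> Window:
    """Square crop centred on the box, shifted so it stays inside the screen."""
    left, top, right, bottom = box
    cx, cy = (left + right) / 2, (top + bottom) / 2
    w, h = min(cfg.crop_size, screen_w), min(cfg.crop_size, screen_h)
    x0 = int(min(max(cx - w / 2, 0), screen_w - w))
    y0 = int(min(max(cy - h / 2, 0), screen_h - h))
    return x0, y0, x0 + w, y0 + h


def ground(
    instruction: str,
    screen_w: int,
    screen_h: int,
    refine: Callable[[str], str],            # hypothetical instruction rewriter
    coarse: Callable[[str], Box],            # hypothetical full-screen grounder
    fine: Callable[[str, Window], Point],    # hypothetical crop-local grounder
    cfg: ZoomConfig = ZoomConfig(),
) -> Point:
    refined = refine(instruction)
    box = coarse(refined)
    if not needs_zoom(box, screen_w, screen_h, cfg):
        # Large enough target: click the centre of the first-pass box.
        return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    window = crop_window(box, screen_w, screen_h, cfg)
    local_x, local_y = fine(refined, window)
    # Translate crop-local coordinates back to full-screen coordinates.
    return local_x + window[0], local_y + window[1]
```

In this sketch the second model call happens only for small targets, so large, easy elements cost a single pass. A real system would plug actual model calls into the three callables.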
