GUI Agents Papers

AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement

Siqi Pei, Liang Tang, Tiaonan Duan, Long Chen, Shuxian Li, Kaer Huang, Yanzhe Jing, Yiqiang Yan, Bo Zhang, Chenghao Jiang, Borui Zhang, Jiwen Lu

🏛 Institutions
Lenovo Research
📅 Date
March 18, 2026
📑 Publisher
arXiv
💻 Env
General GUI
TLDR

AdaZoom-GUI targets two concrete GUI-grounding bottlenecks: ambiguous natural-language instructions and tiny UI elements in high-resolution screenshots. It combines instruction rewriting with a conditional second-stage zoom-in pass, and it reports state-of-the-art grounding performance among models of comparable size.
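
To make the conditional zoom-in idea concrete, here is a minimal Python sketch of the coordinate bookkeeping such a second pass needs. It is not the paper's implementation: the trigger rule (zoom when the first-pass box covers a tiny share of the screen), the `min_area_ratio` threshold, the fixed crop size, and all names (`Box`, `needs_zoom`, `crop_window`, `to_screen_coords`) are assumptions made for illustration. The instruction-rewriting step and the grounding model itself are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top


def needs_zoom(pred: Box, screen_w: int, screen_h: int,
               min_area_ratio: float = 0.001) -> bool:
    """Assumed trigger: zoom only if the first-pass box is a tiny share of the screen."""
    return (pred.width * pred.height) / (screen_w * screen_h) < min_area_ratio


def crop_window(pred: Box, screen_w: int, screen_h: int,
                crop_w: int, crop_h: int) -> Box:
    """Fixed-size crop centred on the first-pass box, clamped to the screenshot."""
    crop_w, crop_h = min(crop_w, screen_w), min(crop_h, screen_h)
    cx = (pred.left + pred.right) / 2
    cy = (pred.top + pred.bottom) / 2
    left = min(max(cx - crop_w / 2, 0), screen_w - crop_w)
    top = min(max(cy - crop_h / 2, 0), screen_h - crop_h)
    return Box(left, top, left + crop_w, top + crop_h)


def to_screen_coords(x: float, y: float, window: Box, zoom: float) -> tuple:
    """Map a point predicted on the upscaled crop back to full-screenshot pixels."""
    return window.left + x / zoom, window.top + y / zoom
```

In use, a first grounding pass would produce `pred`. If `needs_zoom` is true, the crop from `crop_window` is upscaled by `zoom` and grounded again, and `to_screen_coords` converts the second-pass point into a click location on the original screenshot. Otherwise the first-pass result is used directly.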
