GUI Agents Papers

Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding

Zhiyuan Jiang, Shenghao Xie, Wenyi Li, Wenqiang Zu, Peihang Li, Jiahao Qiu, Siqi Pei, Lei Ma, Tiejun Huang, Mengdi Wang, Shilong Liu

🏛 Institutions
Xi’an Jiaotong University, Princeton, PKU, University of Chinese Academy of Sciences, HKU, Michigan State University
📅 Date
December 5, 2025
📑 Publisher
arXiv
💻 Env
General GUI
🔑 Keywords
TLDR

This paper studies zooming as a test-time prior for GUI grounding and proposes ZoomClick, which decides when to zoom, how far to zoom, and when to return to the original view during localization. It also introduces GUIZoom-Bench and reports stronger grounding results across several mainstream benchmarks.
