GUI Agents Papers

Visual Test-time Scaling for GUI Agent Grounding

Tiange Luo, Lajanugen Logeswaran, Justin Johnson, Honglak Lee

🏛 Institutions
University of Michigan, LG AI Research
📅 Date
May 1, 2025
📑 Publisher
ICCV 2025
💻 Env
General GUI
🔑 Keywords
TLDR

This paper frames GUI grounding as an iterative visual search process rather than a single full-screen prediction. Its RegionFocus method repeatedly zooms into promising regions and uses an image-as-map view to expose landmarks and action candidates, improving grounding on ScreenSpot-Pro and WebVoyager without retraining the base model.
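The iterative zoom-in idea can be illustrated with a minimal sketch. This is not the authors' RegionFocus implementation: `predict` is a hypothetical stand-in for the base model, the fixed `shrink` factor and step count are assumptions, and the image-as-map view is not modeled. The toy predictor simply quantizes its answer to a coarse grid so that a tighter crop yields a finer guess.

```python
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in screen pixels
Point = Tuple[float, float]


def zoom_box(center: Point, box: Box, shrink: float) -> Box:
    """Return a sub-box `shrink` times the size of `box`, centred on `center`
    and shifted as needed so it stays inside `box`."""
    left, top, right, bottom = box
    w, h = (right - left) * shrink, (bottom - top) * shrink
    cx = min(max(center[0], left + w / 2), right - w / 2)
    cy = min(max(center[1], top + h / 2), bottom - h / 2)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iterative_ground(
    predict: Callable[[Box], Point],
    screen: Box,
    steps: int = 3,
    shrink: float = 0.5,
) -> Point:
    """Repeatedly ask `predict` for a point inside the current crop, then zoom in.

    `predict` (hypothetical) returns normalized (u, v) in [0, 1] relative to the
    crop; each guess is mapped back to full-screen pixel coordinates.
    """
    box = screen
    point = ((screen[0] + screen[2]) / 2, (screen[1] + screen[3]) / 2)
    for _ in range(steps):
        u, v = predict(box)
        point = (box[0] + u * (box[2] - box[0]), box[1] + v * (box[3] - box[1]))
        box = zoom_box(point, box, shrink)
    return point


def toy_predict(box: Box, target: Point = (1500.0, 300.0)) -> Point:
    """Toy model (assumption): locates `target` only up to a 10x10 grid over the crop."""
    left, top, right, bottom = box
    u = (target[0] - left) / (right - left)
    v = (target[1] - top) / (bottom - top)

    def cell_centre(x: float) -> float:
        return (min(max(int(x * 10), 0), 9) + 0.5) / 10

    return cell_centre(u), cell_centre(v)
```

Calling `iterative_ground(toy_predict, (0, 0, 1920, 1080), steps=1)` corresponds to a single full-screen prediction, while a larger `steps` value narrows the crop around each successive guess. No model weights change between steps, which mirrors the summary's point that the gain comes from test-time search rather than retraining.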
