ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search
Hyunseok Lee, Jeonghoon Kim, Beomjun Kim, Jihoon Tack, Chansong Jo, Jaehong Lee, Cheonbok Park, Sookyo In, Jinwoo Shin, Kang Min Yoo
- 🏛 Institutions: KAIST, NAVER Cloud
- 📅 Date: May 21, 2025
- 📑 Publisher: arXiv
- 💻 Env: Web
- 🔑 Keywords
TLDR
ReGUIDE improves web GUI grounding under limited data by combining self-generated reasoning, spatially aware criticism, and test-time spatial search. It substantially outperforms baselines while using only a tiny fraction of the training data required by prior web-grounding approaches.
Related papers
- Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding · December 5, 2025 · arXiv
- OpeFlo: Automated UX Evaluation via Simulated Human Web Interaction with GUI Grounding · February 25, 2026 · arXiv
- WebTestPilot: Agentic End-to-End Web Testing against Natural Language Specification by Inferring Oracles with Symbolized GUI Elements · February 12, 2026 · arXiv
- Agentic Test-Time Scaling for WebAgents · February 12, 2026 · arXiv
- VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks · December 18, 2025 · arXiv
- Test-Time Reinforcement Learning for GUI Grounding via Region Consistency · August 7, 2025 · AAAI 2026