ScaleTrack: Scaling and back-tracking Automated GUI Agents
Jing Huang, Zhixiong Zeng, Wenkang Han, Yufeng Zhong, Liming Zheng, Shuai Fu, Jingyuan Chen, Lin Ma
- 🏛 Institutions
- Meituan, ZJU, University of Adelaide
- 📅 Date
- May 1, 2025
- 📑 Publisher
- arXiv
- 💻 Env
- Desktop Mobile Web
- 🔑 Keywords
TLDR
ScaleTrack targets two training bottlenecks in automated GUI agents: weak grounding data coverage and the lack of backtracking behavior during planning. It aggregates GUI samples from heterogeneous sources into a unified grounding corpus and trains agents to predict the next action together with the historical actions that led to the current screen.
Related papers
- VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding TasksDecember 18, 2025 · arXiv
- UI-TARS: Pioneering Automated GUI Interaction with Native AgentsJanuary 21, 2025 · arXiv
- Ponder & Press: Advancing Visual GUI Agent towards General Computer ControlDecember 2, 2024 · Findings of ACL 2025
- Improved GUI Grounding via Iterative NarrowingNovember 18, 2024 · arXiv
- OS-ATLAS: A Foundation Action Model for Generalist GUI AgentsOctober 30, 2024 · ICLR 2025 (Spotlight)
- TinyClick: Single-Turn Agent for Empowering GUI AutomationOctober 9, 2024 · INTERSPEECH 2025