GUI Agents Papers

GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents

Yuqi Zhou, Sunhao Dai, Shuai Wang, Kaiwen Zhou, Qinglin Jia, Jun Xu

🏛 Institutions
Renmin University of China, Huawei Noah's Ark Lab
📅 Date
May 21, 2025
📑 Publisher
NeurIPS 2025 (Poster)
💻 Env
General GUI
🔑 Keywords
TLDR

This paper analyzes why directly applying R1-Zero-style online RL pipelines to GUI grounding leads to poor behavior, including overlong reasoning, reward hacking on box size, and under-optimization of hard examples. It then proposes targeted fixes in prompt design, reward shaping, and difficulty-aware policy optimization. The resulting GUI-G1 model sets a new state of the art for its scale on ScreenSpot-style GUI grounding benchmarks.
