GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents

Yuqi Zhou, Sunhao Dai, Shuai Wang, Kaiwen Zhou, Qinglin Jia, Jun Xu

🏛 Institutions: Renmin University of China, Huawei Noah's Ark Lab
📅 Date: May 21, 2025
📑 Publisher: NeurIPS 2025 (Poster)
💻 Env: General GUI
🔑 Keywords: GUI grounding, reinforcement learning, fast thinking template, difficulty-aware scaling, GUI-G1

TLDR

This paper analyzes why directly transplanting R1-Zero-style online RL pipelines into GUI grounding works poorly, with failure modes that include overlong reasoning, reward hacking on box size, and under-optimization on hard examples. It then proposes targeted fixes in prompt design, reward shaping, and difficulty-aware policy optimization. The resulting GUI-G1 model sets a new state of the art for its scale on ScreenSpot-style GUI grounding benchmarks.
