ProgRM: Build Better GUI Agents with Progress Rewards
Danyang Zhang, Situo Zhang, Ziyue Yang, Zichen Zhu, Zihan Zhao, Ruisheng Cao, Lu Chen, Kai Yu
- 🏛 Institutions
- SJTU AI Institute, SJTU, Jiangsu Key Lab of Language Computing, Suzhou Laboratory
- 📅 Date
- May 23, 2025
- 📑 Publisher
- arXiv
- 💻 Env
- General GUI
- 🔑 Keywords
TLDR
ProgRM studies how to replace coarse outcome-only rewards with dense step-level progress signals for GUI-agent reinforcement learning. It uses an LCS-based self-annotation method to assign progress labels from successful trajectories and shows that progress rewards outperform outcome reward models across GUI benchmarks.
Related papers
- OS-Themis: A Scalable Critic Framework for Generalist GUI RewardsMarch 19, 2026 · arXiv
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI AgentsApril 13, 2026 · arXiv
- Android Coach: Improve Online Agentic Training Efficiency with Single State Multiple ActionsApril 8, 2026 · arXiv
- CGL: Advancing Continual GUI Learning via Reinforcement Fine-TuningMarch 3, 2026 · arXiv
- GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RLFebruary 25, 2026 · arXiv
- Building Autonomous GUI Navigation via Agentic-Q Estimation and Step-Wise Policy OptimizationFebruary 14, 2026 · arXiv