ZeroGUI: Automating Online GUI Learning at Zero Human Cost
Chenyu Yang, Shiqian Su, Shi Liu, Xuan Dong, Yue Yu, Weijie Su, Xuehui Wang, Zhaoyang Liu, Jinguo Zhu, Hao Li, Wenhai Wang, Yu Qiao, Xizhou Zhu, Jifeng Dai
- 🏛 Institutions: Shanghai AI Laboratory, Tsinghua, SJTU, HKUST, CUHK
- 📅 Date: May 29, 2025
- 📑 Publisher: arXiv
- 💻 Env: Desktop, Mobile
- 🔑 Keywords
TLDR
ZeroGUI trains GUI agents online without human labels, replacing static offline supervision. It uses VLMs to generate training tasks and estimate rewards, and it trains the agent with two-stage online reinforcement learning, improving both desktop and mobile GUI agents on OSWorld and AndroidLab.
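The label-free data collection described above can be pictured roughly as the loop below. This is a minimal sketch, not the authors' implementation: `collect_experience`, the `toy_*` stand-ins, and the binary reward are all hypothetical names and choices made for illustration, and the policy update and the two-stage schedule are left out entirely.

```python
from typing import Callable, List, Tuple

Trajectory = List[str]  # a rollout, recorded here as a list of action strings


def collect_experience(
    screens: List[str],
    propose_tasks: Callable[[str], List[str]],
    run_policy: Callable[[str, str], Trajectory],
    judge: Callable[[str, Trajectory], float],
) -> List[Tuple[str, Trajectory, float]]:
    """Gather (task, trajectory, reward) triples without human labels."""
    experience = []
    for screen in screens:
        for task in propose_tasks(screen):         # task proposed by a VLM (assumed)
            trajectory = run_policy(screen, task)  # agent acts in the environment
            reward = judge(task, trajectory)       # reward estimated by a VLM (assumed)
            experience.append((task, trajectory, reward))
    return experience


# Toy stand-ins so the sketch runs without a model or a GUI environment.
def toy_propose_tasks(screen: str) -> List[str]:
    return [f"open settings on {screen}", f"close window on {screen}"]


def toy_run_policy(screen: str, task: str) -> Trajectory:
    # Pretend the agent only knows how to open things.
    return ["click", "done"] if task.startswith("open") else ["fail"]


def toy_judge(task: str, trajectory: Trajectory) -> float:
    # Binary success signal: 1.0 if the rollout ended with "done".
    return 1.0 if trajectory[-1] == "done" else 0.0


batch = collect_experience(
    ["desktop", "phone"], toy_propose_tasks, toy_run_policy, toy_judge
)
```

In a real setup the three callables would wrap a task-proposing VLM, the agent running in an environment such as OSWorld or AndroidLab, and a VLM-based judge; the resulting triples would then feed a reinforcement learning update.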
Related papers (24)
- Generalization in Online Reinforcement Learning for Mobile Agents · March 8, 2026 · arXiv
- Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation · September 28, 2025 · arXiv
- GUI-R1: A Generalist R1-Style Vision-Language Action Model for GUI Agents · April 14, 2025 · arXiv
- MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research · May 25, 2026 · arXiv
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents · April 13, 2026 · arXiv
- Android Coach: Improve Online Agentic Training Efficiency with Single State Multiple Actions · April 8, 2026 · arXiv
- Don't Act Blindly: Robust GUI Automation via Action-Effect Verification and Self-Correction · April 7, 2026 · ACL 2026
- UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience · March 25, 2026 · arXiv
- Adaptive Milestone Reward for GUI Agents · February 12, 2026 · arXiv
- UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents · February 5, 2026 · arXiv
- SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents · December 26, 2025 · arXiv
- ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents · August 19, 2025 · ICLR 2026 (Poster)
- DPO Learning with LLMs-Judge Signal for Computer Use Agents · June 3, 2025 · arXiv
- AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning · June 2, 2025 · EMNLP 2025 System Demonstrations
- ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay · May 22, 2025 · arXiv
- GUI-Shift: Enhancing VLM-Based GUI Agents through Self-supervised Reinforcement Learning · May 18, 2025 · ICLR 2026 (Poster)
- UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning · March 27, 2025 · arXiv
- Advancing Autonomous VLM Agents via Variational Subgoal-Conditioned Reinforcement Learning · February 11, 2025 · arXiv
- DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents · October 18, 2024 · ICLR 2025 (Poster)
- DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning · June 14, 2024 · NeurIPS 2024 Main Conference Track
- A Data-Driven Approach for Learning to Control Computers · February 16, 2022 · ICML 2022
- AndroidEnv: A Reinforcement Learning Platform for Android · May 27, 2021 · arXiv
- GUI-C²: Coarse-to-Fine GUI Grounding via Difficulty-Aware Reinforcement Learning · May 29, 2026 · arXiv
- LiteGUI: Distilling Compact GUI Agents with Reinforcement Learning · May 8, 2026 · arXiv