AgentCPM‑GUI: Building Mobile‑Use Agents with Reinforcement Fine‑Tuning
Zhong Zhang, Yaxi Lu, Yikun Fu, Yupeng Huo, Shenzhi Yang, Yesai Wu, Han Si, Xin Cong, Haotian Chen, Yankai Lin, Jie Xie, Wei Zhou, Wang Xu, Yuanheng Zhang, Zhou Su, Zhongwu Zhai, Xiaoming Liu, Yudong Mei, Jianming Xu, Hongyan Tian, Chongyi Wang, Chi Chen, Yuan Yao, Zhiyuan Liu, Maosong Sun
- 🏛 Institutions
- Tsinghua, Renmin University of China, ModelBest
- 📅 Date
- June 2, 2025
- 📑 Publisher
- EMNLP 2025 System Demonstrations
- 💻 Env
- Mobile
- 🔑 Keywords
TLDR
AgentCPM-GUI is an 8B mobile GUI model aimed at robust on-device interaction, especially for Chinese and English interfaces. It combines grounding-aware pre-training, supervised trajectory imitation, and GRPO-based reinforcement fine-tuning, and reports strong results on five public benchmarks plus the newly proposed Chinese benchmark CAGUI.
Related papers
- Don't Act Blindly: Robust GUI Automation via Action-Effect Verification and Self-CorrectionApril 7, 2026 · ACL 2026
- Generalization in Online Reinforcement Learning for Mobile AgentsMarch 8, 2026 · arXiv
- AutoWebGLM: A Large Language Model-based Web Navigating AgentApril 4, 2024 · KDD 2024
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI AgentsApril 13, 2026 · arXiv
- SecAgent: Efficient Mobile GUI Agent with Semantic ContextMarch 9, 2026 · arXiv
- OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task ExecutionJanuary 28, 2026 · arXiv