Adaptive Milestone Reward for GUI Agents
Congmin Zheng, Xiaoyun Mo, Xinbei Ma, Qiqiang Lin, Yin Zhao, Jiachen Zhu, Xingyu Lou, Jun Wang, Zhaoxiang Wang, Weiwen Liu, Zhuosheng Zhang, Yong Yu, Weinan Zhang
- 🏛 Institutions
- SJTU, OPPO Research Institute
- 📅 Date
- February 12, 2026
- 📑 Publisher
- arXiv
- 💻 Env
- Mobile
- 🔑 Keywords
TLDR
ADMIRE is a reinforcement-learning reward design for GUI agents that distills adaptive, verifiable milestones from successful trajectories and pairs them with asymmetric credit assignment. It improves AndroidWorld performance by more than 10 absolute points and transfers to other RL algorithms and environments.
Related papers (24)
- UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience · March 25, 2026 · arXiv
- SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization · January 30, 2026 · arXiv
- MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research · May 25, 2026 · arXiv
- SE-GA: Memory-Augmented Self-Evolution for GUI Agents · May 16, 2026 · arXiv
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents · April 13, 2026 · arXiv
- Android Coach: Improve Online Agentic Training Efficiency with Single State Multiple Actions · April 8, 2026 · arXiv
- Don't Act Blindly: Robust GUI Automation via Action-Effect Verification and Self-Correction · April 7, 2026 · ACL 2026
- HATS: Hardness-Aware Trajectory Synthesis for GUI Agents · March 12, 2026 · CVPR 2026
- Generalization in Online Reinforcement Learning for Mobile Agents · March 8, 2026 · arXiv
- UI-Mem: Self-Evolving Experience Memory for Online Reinforcement Learning in Mobile GUI Agents · February 5, 2026 · arXiv
- SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents · December 26, 2025 · arXiv
- Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation · September 28, 2025 · arXiv
- MobileRL: Online Agentic Reinforcement Learning for Mobile GUI Agents · September 10, 2025 · ICLR 2026 (Poster)
- Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control · September 1, 2025 · NeurIPS 2025 (Poster)
- AgentCPM‑GUI: Building Mobile‑Use Agents with Reinforcement Fine‑Tuning · June 2, 2025 · EMNLP 2025 System Demonstrations
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost · May 29, 2025 · arXiv
- GUI-Shift: Enhancing VLM-Based GUI Agents through Self-supervised Reinforcement Learning · May 18, 2025 · ICLR 2026 (Poster)
- GUI-R1: A Generalist R1-Style Vision-Language Action Model for GUI Agents · April 14, 2025 · arXiv
- UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning · March 27, 2025 · arXiv
- Advancing Autonomous VLM Agents via Variational Subgoal-Conditioned Reinforcement Learning · February 11, 2025 · arXiv
- AppVLM: A Lightweight Vision Language Model for Online App Control · February 10, 2025 · arXiv
- DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents · October 18, 2024 · ICLR 2025 (Poster)
- DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning · June 14, 2024 · NeurIPS 2024 Main Conference Track
- AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents · May 23, 2024 · ICLR 2025 (Poster)