MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research
Dingbang Wu, Rui Hao, Haiyang Wang, Shuzhe Wu, Han Xiao, Zhenghong Li, Bojiang Zhou, Zheng Ju, Zichen Liu, Lue Fan, Zhaoxiang Zhang
- 🏛 Institutions
- CASIA, PKU, CUHK
- 📅 Date
- May 25, 2026
- 📑 Publisher
- arXiv
- 💻 Env
- Mobile
- 🔑 Keywords
TLDR
MobileGym is a browser-hosted, Android-like simulation platform for mobile GUI agent research. It represents the full environment state as structured JSON, which enables deterministic state-based judging, snapshot/reset/fork, side-effect detection, and highly parallel rollouts for online RL. Its benchmark, MobileGym-Bench, provides 416 parameterized task templates over 28 apps. GRPO training of Qwen3-VL-4B-Instruct improves performance on a 256-task test set by 12.8 points, with 95.1% sim-to-real gain retention on a real-device subset.
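To make the idea of judging from a structured JSON state more concrete, here is a minimal sketch in Python. It is not MobileGym's API: the function names (`snapshot`, `diff_paths`, `judge`), the state layout, and the notion of an "allowed" set of changed paths are all hypothetical, chosen only to illustrate how snapshot/fork, a deterministic state-based verdict, and side-effect detection could fit together.

```python
import copy

_MISSING = object()  # sentinel for a goal path that does not exist


def snapshot(state):
    """Return an independent deep copy of a JSON-like state (hypothetical helper)."""
    return copy.deepcopy(state)


def diff_paths(before, after, prefix=""):
    """Return the set of slash-separated paths whose values differ.

    Dicts are compared key by key; every other value (including lists)
    is treated as a leaf and compared with ==.
    """
    if isinstance(before, dict) and isinstance(after, dict):
        changed = set()
        for key in before.keys() | after.keys():
            path = f"{prefix}/{key}"
            if key not in before or key not in after:
                changed.add(path)
            else:
                changed |= diff_paths(before[key], after[key], path)
        return changed
    return set() if before == after else {prefix}


def lookup(state, path):
    """Follow a slash-separated path through nested dicts."""
    node = state
    for key in path.strip("/").split("/"):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def judge(before, after, goal_path, goal_value, allowed_paths):
    """Deterministic verdict: goal reached, plus any changes outside allowed_paths."""
    changed = diff_paths(before, after)
    return {
        "success": lookup(after, goal_path) == goal_value,
        "side_effects": sorted(p for p in changed if p not in allowed_paths),
    }


# Assumed toy state; real environment states would be far richer.
base = {"settings": {"wifi": False, "bluetooth": False}, "notes": []}

# Fork two rollouts from the same snapshot.
fork_a = snapshot(base)
fork_a["settings"]["wifi"] = True

fork_b = snapshot(base)
fork_b["settings"]["wifi"] = True
fork_b["settings"]["bluetooth"] = True  # unintended extra change

result_a = judge(base, fork_a, "/settings/wifi", True, {"/settings/wifi"})
result_b = judge(base, fork_b, "/settings/wifi", True, {"/settings/wifi"})
print(result_a)
print(result_b)
```

In this sketch both forks reach the goal, but only the second one reports `/settings/bluetooth` as a side effect. Because the verdict depends only on the before and after states, it is reproducible and does not require inspecting screenshots or action traces.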
Related papers (24)
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents · April 13, 2026 · arXiv
- Don't Act Blindly: Robust GUI Automation via Action-Effect Verification and Self-Correction · April 7, 2026 · ACL 2026
- Generalization in Online Reinforcement Learning for Mobile Agents · March 8, 2026 · arXiv
- GUITester: Enabling GUI Agents for Exploratory Defect Discovery · January 8, 2026 · arXiv
- AgentCPM‑GUI: Building Mobile‑Use Agents with Reinforcement Fine‑Tuning · June 2, 2025 · EMNLP 2025 System Demonstrations
- GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent · May 22, 2025 · ACL 2025
- LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark · April 18, 2025 · arXiv
- DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents · October 18, 2024 · ICLR 2025 (Poster)
- You Only Look at Screens: Multimodal Chain-of-Action Agents · September 20, 2023 · Findings of ACL 2024
- AutoDroid: LLM-powered Task Automation in Android · August 29, 2023 · MobiCom 2024
- WebArena-Infinity: Generating Browser Environments with Verifiable Tasks at Scale · March 2026 · Blog Post
- LongHorizonUI: A Unified Framework for Robust long-horizon Task Automation of GUI Agent · January 26, 2026 · ICLR 2026 (Poster)
- WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks · January 5, 2026 · arXiv
- WorldGUI: An Interactive Benchmark for Desktop GUI Automation from Any Starting Point · February 12, 2025 · arXiv
- WebWalker: Benchmarking LLMs in Web Traversal · January 13, 2025 · arXiv
- Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents · December 17, 2024 · ICML 2025 (Poster)
- The BrowserGym Ecosystem for Web Agent Research · December 6, 2024 · TMLR
- AutoWebGLM: A Large Language Model-based Web Navigating Agent · April 4, 2024 · KDD 2024
- SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models · May 30, 2023 · NeurIPS 2023
- Grounding Open-Domain Instructions to Automate Web Support Tasks · March 30, 2021 · NAACL 2021
- Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration · February 24, 2018 · ICLR 2018 (Poster)
- Benchmarking Living-Screen-Native GUI Agents on Short-Video Platforms · June 3, 2026 · arXiv
- AndroidDaily: A Verifiable Benchmark for Mobile GUI Agents on Real-World Closed-Source Applications · May 26, 2026 · arXiv
- SimuWoB: Simulating Real-World Mobile Apps for Fast and Faithful GUI Agent Benchmarking · May 24, 2026 · arXiv