Mobile-Agent-v3: Fundamental Agents for GUI Automation
Jiabo Ye, Xi Zhang, Haiyang Xu, Haowei Liu, Junyang Wang, Zhaoqing Zhu, Ziwei Zheng, Feiyu Gao, Junjie Cao, Zhengxi Lu, Jitong Liao, Qi Zheng, Fei Huang, Jingren Zhou, Ming Yan
- 🏛 Institutions
- Tongyi Lab, Alibaba Group
- 📅 Date
- August 21, 2025
- 📑 Publisher
- arXiv
- 💻 Env
- Desktop Mobile
- 🔑 Keywords
TLDR
This paper introduces GUI-Owl as a foundation model for GUI automation and builds Mobile-Agent-v3 as a multi-agent framework on top of it. The work combines cross-OS trajectory production, diverse GUI data synthesis, reasoning enhancement, and trajectory-aware RL, and reports stronger open-source results on both AndroidWorld and OSWorld.
Related papers
- Mobile-Agent-v3.5: Multi-platform Fundamental GUI AgentsFebruary 15, 2026 · arXiv
- OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task ExecutionJanuary 28, 2026 · arXiv
- MAI-UI Technical Report: Real-World Centric Foundation GUI AgentsDecember 26, 2025 · arXiv
- Step-GUI Technical ReportDecember 17, 2025 · arXiv
- ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform DataSeptember 18, 2025 · ICLR 2026 (Oral)
- UI-Venus Technical Report: Building High-performance UI Agents with RFTAugust 14, 2025 · arXiv