ReachAgent: Enhancing Mobile Agent via Page Reaching and Operation
Qinzhuo Wu, Wei Liu, Jian Luan, Bin Wang
- 🏛 Institutions: XiaoMi AI Lab
- 📅 Date: April 30, 2025
- 📑 Publisher: NAACL 2025 (Poster)
- 💻 Env: Mobile
- 🔑 Keywords
TLDR
ReachAgent addresses the tendency of mobile agents to optimize for the next local action while ignoring the larger GUI flow. It introduces the MobileReach training dataset, which decomposes tasks into page-reaching and page-operation subtasks, and uses those subtasks together with reward-based preference GUI flows to train a two-stage mobile agent.
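As a rough illustration of the idea in the TLDR, the sketch below models a task as a sequence of page-reaching and page-operation subtasks and orders two candidate GUI flows into a (chosen, rejected) preference pair by a reward. Everything here is an assumption made for illustration: the class names, the fields, the toy reward (completed subtasks minus a small per-action cost), and the example task are hypothetical and are not taken from the paper or the MobileReach dataset.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Subtask:
    # Hypothetical schema: kind is "page_reaching" or "page_operation".
    kind: str
    goal: str


@dataclass
class GuiFlow:
    # Hypothetical representation of one attempt at a task.
    actions: List[str]      # action strings, in execution order
    completed: List[bool]   # one flag per subtask, in subtask order


def flow_reward(flow: GuiFlow, step_cost: float = 0.01) -> float:
    """Toy reward (an assumption): completed subtasks minus a small cost per action."""
    return sum(flow.completed) - step_cost * len(flow.actions)


def preference_pair(a: GuiFlow, b: GuiFlow) -> Tuple[GuiFlow, GuiFlow]:
    """Return (chosen, rejected), ordered by the toy reward."""
    return (a, b) if flow_reward(a) >= flow_reward(b) else (b, a)


# Hypothetical task: "Turn on dark mode in Settings".
subtasks = [
    Subtask("page_reaching", "reach the Display settings page"),
    Subtask("page_operation", "toggle the dark mode switch"),
]

flow_a = GuiFlow(
    actions=["open Settings", "tap Display", "toggle Dark mode"],
    completed=[True, True],
)
flow_b = GuiFlow(
    actions=["open Settings", "tap Sound"],
    completed=[False, False],
)

chosen, rejected = preference_pair(flow_a, flow_b)
```

In this toy setup `flow_a` is preferred because it reaches the target page and performs the operation, while `flow_b` stops on the wrong page. Pairs of this shape are the usual input to preference-based training, though the actual reward design and training procedure should be taken from the paper itself.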
Related papers
- BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism · May 27, 2025 · EMNLP 2025 (Oral)
- MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation · April 30, 2025 · NAACL 2025 (System Demonstrations)
- LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark · April 18, 2025 · arXiv
- META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI · May 23, 2022 · EMNLP 2022
- SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models · May 30, 2023 · NeurIPS 2023
- Grounding Open-Domain Instructions to Automate Web Support Tasks · March 30, 2021 · NAACL 2021