BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism
Qinzhuo Wu, Pengzhi Gao, Wei Liu, Jian Luan
- 🏛 Institutions: MiLM Plus, Xiaomi
- 📅 Date: May 27, 2025
- 📑 Publisher: EMNLP 2025 (Oral)
- 💻 Env: Mobile
- 🔑 Keywords
TLDR
BacktrackAgent addresses the lack of error recovery in mobile GUI agents by adding verifier, judger, and reflector modules plus an explicit backtracking mechanism. It also builds training data for judgment and reflection over post-action outcome pages, improving both task success and step accuracy on Mobile3M and Auto-UI.
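To make the control flow concrete, here is a minimal Python sketch of a single agent step with error detection and backtracking. It is an illustration only, not the authors' implementation. The summary above names the modules but does not define their interfaces, so everything here is an assumption: the verifier is assumed to check an action before execution, the judger is assumed to inspect the post-action outcome page, and the reflector is assumed to propose a replacement action. `ToyEnv`, `run_step`, and all parameter names are hypothetical, and pages and actions are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Page = str    # assumption: a page is reduced to a string label
Action = str  # assumption: an action is reduced to a string label


@dataclass
class ToyEnv:
    """Hypothetical toy environment that keeps a page history so a step can be undone."""
    transitions: Dict[Tuple[Page, Action], Page]
    history: List[Page] = field(default_factory=lambda: ["home"])

    @property
    def page(self) -> Page:
        return self.history[-1]

    def step(self, action: Action) -> Page:
        # Unknown actions leave the page unchanged.
        nxt = self.transitions.get((self.page, action), self.page)
        self.history.append(nxt)
        return nxt

    def backtrack(self) -> Page:
        # Undo the last step by dropping the most recent page.
        if len(self.history) > 1:
            self.history.pop()
        return self.page


def run_step(
    env: ToyEnv,
    propose: Callable[[Page], Action],
    verify: Callable[[Page, Action], bool],
    judge: Callable[[Page, Action, Page], bool],
    reflect: Callable[[Page, Action], Action],
    max_retries: int = 2,
) -> Optional[Action]:
    """Try one step; on a rejected outcome, backtrack and retry with a reflected action."""
    action = propose(env.page)
    for _ in range(max_retries + 1):
        if verify(env.page, action):           # assumed pre-execution check
            before = env.page
            after = env.step(action)
            if judge(before, action, after):   # assumed check of the outcome page
                return action
            env.backtrack()                    # outcome rejected: return to the previous page
        action = reflect(env.page, action)     # assumed: propose a replacement action
    return None                                # no accepted action within the retry budget


# Usage: the first proposal opens the wrong page, so the step is undone and retried.
env = ToyEnv({("home", "tap_settings"): "settings", ("home", "tap_search"): "search"})
chosen = run_step(
    env,
    propose=lambda page: "tap_settings",
    verify=lambda page, action: action.startswith("tap_"),
    judge=lambda before, action, after: after == "search",
    reflect=lambda page, action: "tap_search",
)
print(chosen, env.history)
```

In this toy run the rejected step leaves no trace in the page history, which is the property the backtracking mechanism is meant to provide. In a real agent each callable would presumably be a model call over screenshots or page descriptions rather than a lambda.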
Related papers
- ReachAgent: Enhancing Mobile Agent via Page Reaching and Operation · April 30, 2025 · NAACL 2025 (Poster)
- MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation · April 30, 2025 · NAACL 2025 (System Demonstrations)
- LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark · April 18, 2025 · arXiv
- META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI · May 23, 2022 · EMNLP 2022
- LASER: LLM Agent with State-Space Exploration for Web Navigation · September 15, 2023 · arXiv
- SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models · May 30, 2023 · NeurIPS 2023