META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI
Liangtai Sun , Xingyu Chen , Lu Chen , Tianle Dai , Zichen Zhu , Kai Yu
- 🏛 Institutions
- SJTU
- 📅 Date
- May 23, 2022
- 📑 Publisher
- EMNLP 2022
- 💻 Env
- Mobile
- 🔑 Keywords
TLDR
META-GUI introduces a GUI-based task-oriented dialogue architecture in which the assistant acts on real mobile apps instead of calling task-specific backend APIs. The paper also releases a dataset for training multimodal conversational agents under that GUI-TOD setup.
Related papers (24)
- BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking MechanismMay 27, 2025 · EMNLP 2025 (Oral)
- ReachAgent: Enhancing Mobile Agent via Page Reaching and OperationApril 30, 2025 · NAACL 2025 (Poster)
- MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task AutomationApril 30, 2025 · NAACL 2025 (System Demonstrations)
- LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration BenchmarkApril 18, 2025 · arXiv
- SheetCopilot: Bringing Software Productivity to the Next Level through Large Language ModelsMay 30, 2023 · NeurIPS 2023
- Grounding Open-Domain Instructions to Automate Web Support TasksMarch 30, 2021 · NAACL 2021
- MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent ResearchMay 25, 2026 · arXiv
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI AgentsApril 13, 2026 · arXiv
- PSPA-Bench: A Personalized Benchmark for Smartphone GUI AgentMarch 31, 2026 · arXiv
- Video-Based Reward Modeling for Computer-Use AgentsMarch 10, 2026 · arXiv
- SecAgent: Efficient Mobile GUI Agent with Semantic ContextMarch 9, 2026 · arXiv
- Turing Test on Screen: A Benchmark for Mobile GUI Agent HumanizationFebruary 24, 2026 · arXiv
- AmbiBench: Benchmarking Mobile GUI Agents Beyond One-Shot Instructions in the WildFebruary 12, 2026 · arXiv
- MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic EnvironmentsFebruary 3, 2026 · arXiv
- MAGNET: Towards Adaptive GUI Agents with Memory-Driven Knowledge EvolutionJanuary 27, 2026 · arXiv
- SwipeGen: Bridging the Execution Gap in GUI Agents via Human-like Swipe SynthesisJanuary 26, 2026 · arXiv
- SMAN-Bench: A Cross-System Benchmark for Mobile Agents under Single- and Multi-path, Ambiguous, and Noisy TasksJanuary 26, 2026 · ICLR 2026 (Poster)
- GraphPilot: GUI Task Automation with One-Step LLM Reasoning Powered by Knowledge GraphJanuary 24, 2026 · Journal of Intelligent Computing and Networking
- GUITester: Enabling GUI Agents for Exploratory Defect DiscoveryJanuary 8, 2026 · arXiv
- MobileWorldBench: Towards Semantic World Modeling For Mobile AgentsDecember 16, 2025 · arXiv
- Surfer 2: The Next Generation of Cross-Platform Computer Use AgentsOctober 22, 2025 · arXiv
- CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMsOctober 17, 2025 · NeurIPS 2025 (Poster)
- NaturalGAIA: Pushing the Frontiers of GUI Agents with a Challenging Benchmark and High-Quality Trajectory DatasetAugust 2, 2025 · arXiv
- FingerTip 20K: A Benchmark for Proactive and Personalized Mobile LLM AgentsJune 9, 2025 · ICLR 2026 (Poster)