GraphPilot: GUI Task Automation with One-Step LLM Reasoning Powered by Knowledge Graph
Mingxian Yu, Siqi Luo, Xu Chen
- 🏛 Institutions: Sun Yat-sen University
- 📅 Date: January 24, 2026
- 📑 Publisher: Journal of Intelligent Computing and Networking
- 💻 Env: Mobile
- 🔑 Keywords
TLDR
GraphPilot builds app-specific knowledge graphs that encode page functions, element roles, and transition rules, then uses them to plan a nearly complete action sequence in close to a single LLM query. On DroidTask, it improves task completion while sharply reducing latency and the number of LLM calls relative to stepwise mobile agents.
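To make the idea concrete, here is a minimal sketch of what an app-specific knowledge graph with page functions, element roles, and transition rules might look like. It is not GraphPilot's implementation: the class names, the `serialize` text format, and the toy note-taking app are all hypothetical, and a breadth-first search stands in for the LLM planner purely to show that explicit transition rules let a whole action sequence be derived up front rather than step by step.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transition:
    """One transition rule: performing `action` on `element` in `source` leads to `target`."""
    source: str
    element: str
    action: str
    target: str


@dataclass
class AppGraph:
    """Toy app-specific knowledge graph (hypothetical structure, for illustration only)."""
    page_functions: dict[str, str] = field(default_factory=dict)
    element_roles: dict[tuple[str, str], str] = field(default_factory=dict)
    transitions: list[Transition] = field(default_factory=list)

    def add_transition(self, source, element, action, target, role=""):
        # Record the element's role on its page alongside the transition rule.
        self.element_roles[(source, element)] = role
        self.transitions.append(Transition(source, element, action, target))

    def plan(self, start, goal):
        """Breadth-first search for a shortest action sequence from start to goal."""
        queue = deque([(start, [])])
        visited = {start}
        while queue:
            page, path = queue.popleft()
            if page == goal:
                return path
            for t in self.transitions:
                if t.source == page and t.target not in visited:
                    visited.add(t.target)
                    queue.append((t.target, path + [t]))
        return None  # goal page is unreachable from start


def serialize(graph):
    """Flatten the graph into text that could be placed in a single prompt."""
    lines = [f"PAGE {p}: {desc}" for p, desc in graph.page_functions.items()]
    for t in graph.transitions:
        role = graph.element_roles.get((t.source, t.element), "")
        lines.append(f"RULE {t.source} --{t.action} {t.element} ({role})--> {t.target}")
    return "\n".join(lines)


# Hypothetical note-taking app with three pages.
graph = AppGraph()
graph.page_functions = {
    "home": "lists notes",
    "editor": "edits one note",
    "settings": "app preferences",
}
graph.add_transition("home", "new_note_button", "tap", "editor", role="creates a note")
graph.add_transition("home", "menu_button", "tap", "settings", role="opens preferences")
graph.add_transition("editor", "save_button", "tap", "home", role="saves and returns")

steps = graph.plan("home", "editor")
prompt_context = serialize(graph)
```

In a stepwise agent, each of these actions would typically cost one LLM call on the current screen. With the graph serialized into the prompt, a single query can in principle return the whole sequence, which is the kind of saving in latency and LLM calls the TLDR describes.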
Related papers
- ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents · April 13, 2026 · arXiv
- GUITester: Enabling GUI Agents for Exploratory Defect Discovery · January 8, 2026 · arXiv
- Surfer 2: The Next Generation of Cross-Platform Computer Use Agents · October 22, 2025 · arXiv
- CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMs · October 17, 2025 · NeurIPS 2025 (Poster)
- Agent-SAMA: State-Aware Mobile Assistant · May 29, 2025 · AAAI 2026
- BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism · May 27, 2025 · EMNLP 2025 (Oral)