KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation
Tongbo Chen, Zhengxi Lu, Zhan Xu, Guocheng Shao, Shaohan Zhao, Fei Tang, Yong Du, Kaitao Song, Yizhou Liu, Yuchen Yan, Wenqi Zhang, Xu Tan, Weiming Lu, Jun Xiao, Yueting Zhuang, Yongliang Shen
- 🏛 Institutions: ZJU, Apple, Tencent
- 📅 Date: April 9, 2026
- 📑 Publisher: arXiv
- 💻 Env: Mobile
TLDR
KnowU-Bench is an online benchmark for personalized mobile agents on Android emulation with 42 general, 86 personalized, and 64 proactive tasks. It hides user profiles from the agent and forces genuine preference inference through multi-turn dialogues. Even frontier models fall below 50% under vague instructions requiring preference inference.
Related papers
- Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants (April 1, 2026 · arXiv)
- PSPA-Bench: A Personalized Benchmark for Smartphone GUI Agent (March 31, 2026 · arXiv)
- PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records (January 14, 2026 · arXiv)
- Persona2Web: Benchmarking Personalized Web Agents for Contextual Reasoning with User History (February 19, 2026 · arXiv)
- Large Language Models Empowered Personalized Web Agents (October 22, 2024 · WWW 2025)
- CORA: Conformal Risk-Controlled Agents for Safeguarded Mobile GUI Automation (April 10, 2026 · arXiv)