GUI Agents Papers

VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics

Yichen Gong, Zhuohan Cai, Sunhao Dai, Yuqi Zhou, Zhangxuan Gu, Changhua Meng, Shuheng Shen

🏛 Institutions
Ant Group, RUC
📅 Date
February 6, 2026
📑 Publisher
arXiv
💻 Env
Mobile
🔑 Keywords
TLDR

VenusBench-Mobile addresses the app-centric, task-homogeneous design of prior mobile GUI benchmarks with user-intent-driven tasks and a capability-oriented annotation scheme that supports fine-grained behavior analysis. State-of-the-art mobile GUI agents score substantially lower on it than on existing benchmarks. Failures are dominated by perception and memory deficiencies, and success is near zero under environment variations, which points to persistent brittleness in realistic conditions.
