
AndroidLens: Long-latency Evaluation with Nested Sub-targets for Android GUI Agents

Yue Cao, Yingyao Wang, Pi Bu, Jingxuan Xing, Wei Jiang, Zekun Zhu, Junpeng Ma, Sashuai Zhou, Tong Lu, Jun Song, Yu Cheng, Yuning Jiang, Bo Zheng

🏛 Institutions
NJU, Alibaba Group, Fudan, ZJU
📅 Date
December 24, 2025
📑 Publisher
arXiv
💻 Env
Mobile
TLDR

AndroidLens evaluates Android GUI agents on 571 long-latency tasks from 38 domains in both Chinese and English settings, with each task decomposed into nested sub-targets. It combines anomaly-preserving static evaluation with milestone-based Average Task Progress, and the paper reports that even the best models remain far from robust on these tasks.
