GUI Agents Papers

VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks

Beitong Zhou, Zhexiao Huang, Yuan Guo, Zhangxuan Gu, Tianyu Xia, Zichen Luo, Fei Tang, Dehan Kong, Yanyi Shang, Suling Ou, Zhenlin Guo, Changhua Meng, Shuheng Shen

🏛 Institutions
Venus Team, Ant Group, iMean AI
📅 Date
December 18, 2025
📑 Publisher
arXiv
💻 Env
Desktop, Mobile, Web
TLDR

VenusBench-GD is a bilingual GUI grounding benchmark that spans mobile, desktop, and web platforms and organizes evaluation into basic and advanced grounding tasks. The paper uses this hierarchy to show that general-purpose multimodal models have largely caught up on basic grounding, while advanced tasks still expose substantial gaps in reasoning and robustness.
