GUI Agents Papers

VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks

Beitong Zhou, Zhexiao Huang, Yuan Guo, Zhangxuan Gu, Tianyu Xia, Zichen Luo, Fei Tang, Dehan Kong, Yanyi Shang, Suling Ou, Zhenlin Guo, Changhua Meng, Shuheng Shen

🏛 Institutions
Venus Team, Ant Group, iMean AI
📅 Date
December 18, 2025
📑 Publisher
arXiv
💻 Env
Desktop Mobile Web
TLDR

VenusBench-GD is a bilingual GUI grounding benchmark that spans mobile, desktop, and web platforms and organizes evaluation into basic and advanced grounding tasks. The paper uses this two-level hierarchy to show that general-purpose multimodal models have mostly caught up on basic grounding, while the advanced tasks still expose substantial gaps in reasoning and robustness.

Related papers