
GUI-360: A Comprehensive Dataset and Benchmark for Computer-Using Agents

Jian Mu, Chaoyun Zhang, Chiming Ni, Lu Wang, Bo Qiao, Kartik Mathur, Qianhui Wu, Yuhang Xie, Xiaojun Ma, Mengyu Zhou, Si Qin, Liqun Li, Yu Kang, Minghua Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

🏛 Institutions
Microsoft, NJU, ZJU-UIUC, PKU
📅 Date
November 6, 2025
📑 Publisher
arXiv
💻 Env
Desktop
TLDR

GUI-360 addresses the shortage of large-scale, real-world data and unified evaluation for computer-using agents (CUAs). It releases more than 1.2M executed action steps across thousands of trajectories in popular Windows office applications, with full-resolution screenshots, accessibility metadata, intermediate reasoning, and both successful and failed trajectories. It is the first corpus to jointly cover GUI grounding, screen parsing, action prediction, and API-level actions, and it exposes cascading failures of off-the-shelf VLMs on heterogeneous layouts.
