GUI Agents Papers

GUI-360: A Comprehensive Dataset and Benchmark for Computer-Using Agents

Jian Mu, Chaoyun Zhang, Chiming Ni, Lu Wang, Bo Qiao, Kartik Mathur, Qianhui Wu, Yuhang Xie, Xiaojun Ma, Mengyu Zhou, Si Qin, Liqun Li, Yu Kang, Minghua Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

🏛 Institutions
Microsoft, NJU, ZJU-UIUC, PKU
📅 Date
November 6, 2025
📑 Publisher
arXiv
💻 Env
Desktop
TLDR

GUI-360 addresses the lack of large-scale real-world data and unified evaluation for computer-using agents (CUAs) by releasing more than 1.2M executed action steps across thousands of trajectories in popular Windows office applications. The release includes full-resolution screenshots, accessibility metadata, intermediate reasoning, and both successful and failed trajectories. It is the first corpus to jointly cover GUI grounding, screen parsing, action prediction, and API-level actions, and it exposes cascading failures of off-the-shelf VLMs on heterogeneous layouts.
