GUI Agents Papers

WindowsWorld: A Process-Centric Benchmark of Autonomous GUI Agents in Professional Cross-Application Environments

Jinchao Li, Yunxin Li, Chenrui Zhao, Zhenran Xu, Baotian Hu, Min Zhang

🏛 Institutions: HIT-Shenzhen
📅 Date: April 30, 2026
📑 Publisher: arXiv
💻 Env: Desktop
🔑 Keywords:
TLDR

WindowsWorld addresses a gap in existing GUI benchmarks, which focus on isolated single-application tasks. It presents a process-centric suite of 181 cross-application desktop tasks, averaging 5.0 sub-goals each and spanning 17 applications, with 78% of tasks involving multiple applications. The evaluated computer-use agents fall below 21% success on multi-application tasks, substantially trailing their single-application performance and exposing weak workflow-level coordination.
