
MCPWorld: A Unified Benchmarking Testbed for API, GUI, and Hybrid Computer Use Agents

Yunhe Yan, Shihe Wang, Jiajun Du, Yexuan Yang, Yuxuan Shan, Qichen Qiu, Xianqing Jia, Xinge Wang, Xin Yuan, Xu Han, Mao Qin, Yinxiao Chen, Chen Peng, Shangguang Wang, Mengwei Xu

🏛 Institutions
Beijing University of Posts and Telecommunications, Pengcheng Laboratory
📅 Date
June 9, 2025
📑 Publisher
arXiv
💻 Env
Desktop
TLDR

MCPWorld is a desktop computer-use benchmark built around white-box applications whose internals can be instrumented and exposed through MCP-style APIs. That setup lets the paper compare API-only, GUI-only, and hybrid agents under a common task suite with deterministic programmatic verification.
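The idea of comparing the three agent modes under one deterministic check can be illustrated with a toy sketch. Everything here is hypothetical and not taken from the paper or its code: `ToyApp`, `Mode`, `allowed_actions`, `run_task`, and `verify` are invented names, and the "app" is a plain Python object standing in for an instrumented white-box application.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    """Hypothetical agent modes mirroring the API / GUI / hybrid split."""
    API = "api"
    GUI = "gui"
    HYBRID = "hybrid"


@dataclass
class ToyApp:
    """Toy stand-in for a white-box app whose internal state can be read directly."""
    notes: list = field(default_factory=list)

    def api_add_note(self, text):
        # Stand-in for an MCP-style tool call exposed by the app (assumption).
        self.notes.append(text)

    def gui_type_and_save(self, text):
        # Stand-in for simulated keyboard/mouse input reaching the same handler.
        self.notes.append(text)


def allowed_actions(mode):
    """Return the action channels an agent may use in the given mode."""
    if mode is Mode.API:
        return {"api"}
    if mode is Mode.GUI:
        return {"gui"}
    return {"api", "gui"}


def run_task(app, mode, text):
    """Trivial 'agent': prefer the API channel when the mode permits it."""
    if "api" in allowed_actions(mode):
        app.api_add_note(text)
    else:
        app.gui_type_and_save(text)


def verify(app, expected):
    """Deterministic check on internal app state, independent of how it was reached."""
    return app.notes == [expected]


if __name__ == "__main__":
    for mode in Mode:
        app = ToyApp()
        run_task(app, mode, "buy milk")
        print(mode.value, verify(app, "buy milk"))
```

The point of the sketch is that `verify` inspects application state rather than screenshots or agent logs, so the same check applies to all three modes.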
