EntWorld: A Holistic Environment and Benchmark for Verifiable Enterprise GUI Agents

Ying Mo, Yu Bai, Dapeng Sun, Yuqian Shi, Yukai Miao, Li Chen, Dan Li

🏛 Institutions: Zhongguancun Laboratory, Tsinghua
📅 Date: January 25, 2026
📑 Publisher: arXiv
💻 Env: Desktop
🔑 Keywords: benchmark, environment, enterprise workflows, schema-grounded task generation, SQL verification, EntWorld

TLDR

EntWorld introduces a verifiable environment for enterprise GUI agents and a 1,756-task benchmark spanning six business domains, including CRM, ITIL, and ERP. It synthesizes workflows from database schemas and verifies task outcomes deterministically with SQL queries rather than visual matching. Current top models still trail human performance by a large margin.
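
To make the idea of SQL-based deterministic verification concrete, here is a minimal sketch of what such a check might look like. It is not EntWorld's actual implementation: the `tickets` table, the `verify_task` helper, and the expected values are all hypothetical, and an in-memory SQLite database stands in for the enterprise backend. The point is only that success is decided by querying the final database state and comparing it to an expected result, not by inspecting screenshots.

```python
import sqlite3


def verify_task(conn, check_sql, expected_rows):
    """Hypothetical verifier: pass only if the query result matches exactly."""
    actual = conn.execute(check_sql).fetchall()
    return actual == expected_rows


# Hypothetical backend state for an ITIL-style "resolve and assign ticket" task.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT, assignee TEXT)"
)
conn.execute("INSERT INTO tickets VALUES (1, 'open', NULL)")

# The task's success condition, expressed as a query plus expected rows.
check_sql = "SELECT status, assignee FROM tickets WHERE id = 1"
expected = [("resolved", "alice")]

# Before the agent acts, the check fails.
before = verify_task(conn, check_sql, expected)

# Stand-in for the effect of the agent's GUI actions on the database.
conn.execute(
    "UPDATE tickets SET status = 'resolved', assignee = 'alice' WHERE id = 1"
)

# After the (simulated) actions, the same check passes.
after = verify_task(conn, check_sql, expected)
print(before, after)
```

Because the check depends only on database state, the same task yields the same verdict regardless of how the interface was rendered or which path the agent took through it.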
