WorldGUI: An Interactive Benchmark for Desktop GUI Automation from Any Starting Point
Henry Hengyuan Zhao, Kaiming Yang, Wendi Yu, Difei Gao, Mike Zheng Shou
- 🏛 Institutions
- Show Lab, NUS
- 📅 Date
- February 12, 2025
- 📑 Publisher
- arXiv
- 💻 Env
- Desktop, Web
- 🔑 Keywords
TLDR
WorldGUI is a benchmark for evaluating desktop and web GUI agents from diverse non-default starting states instead of only canonical initial setups. The paper also introduces WorldGUI-Agent, a model-agnostic three-stage critique framework that improves adaptation and recovery in those dynamic settings.
Related papers
- WebWalker: Benchmarking LLMs in Web Traversal · January 13, 2025 · arXiv
- The BrowserGym Ecosystem for Web Agent Research · December 6, 2024 · TMLR
- Grounding Open-Domain Instructions to Automate Web Support Tasks · March 30, 2021 · NAACL 2021
- LongHorizonUI: A Unified Framework for Robust long-horizon Task Automation of GUI Agent · January 26, 2026 · ICLR 2026 (Poster)
- GUITester: Enabling GUI Agents for Exploratory Defect Discovery · January 8, 2026 · arXiv
- VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks · December 18, 2025 · arXiv