WorldGUI: An Interactive Benchmark for Desktop GUI Automation from Any Starting Point

Henry Hengyuan Zhao, Kaiming Yang, Wendi Yu, Difei Gao, Mike Zheng Shou

🏛 Institutions: Show Lab, NUS
📅 Date: February 12, 2025
📑 Publisher: arXiv
💻 Env: Desktop, Web
🔑 Keywords: benchmark, framework, dynamic initial states, planning robustness, WorldGUI-Agent, WorldGUI

TLDR

WorldGUI is a benchmark for evaluating desktop and web GUI agents from diverse non-default starting states instead of only canonical initial setups. The paper also introduces WorldGUI-Agent, a model-agnostic three-stage critique framework that improves adaptation and recovery in those dynamic settings.
