GUI Agents Papers

WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks

Léo Boisvert, Megh Thakkar, Maxime Gasse, Massimo Caccia, Thibault Le Sellier De Chezelles, Quentin Cappart, Nicolas Chapados, Alexandre Lacoste, Alexandre Drouin

🏛 Institutions
ServiceNow Research, Mila, Polytechnique Montréal, Chandar Research Lab
📅 Date
July 7, 2024
📑 Publisher
NeurIPS 2024 Datasets and Benchmarks Track (Poster)
💻 Env
Web
TLDR

WorkArena++ is a web benchmark of 682 enterprise knowledge-work tasks, built on ServiceNow, that tests web agents on compositional planning, retrieval, reasoning, and contextual understanding. Beyond the benchmark itself, the paper contributes a mechanism for generating thousands of oracle observation-action traces, which can be used to fine-tune web agents.
