WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?

Alexandre Drouin, Maxime Gasse, Massimo Caccia, Issam H. Laradji, Manuel Del Verme, Tom Marty, David Vazquez, Nicolas Chapados, Alexandre Lacoste

🏛 Institutions: ServiceNow Research, Mila
📅 Date: March 11, 2024
📑 Publisher: ICML 2024
💻 Env: Web
🔑 Keywords: benchmark, WorkArena, enterprise workflows, ServiceNow, BrowserGym

TLDR

WorkArena is a remote-hosted benchmark of 33 enterprise knowledge-work tasks, built on the ServiceNow platform, for evaluating browser-based agents. Alongside the benchmark, the paper introduces BrowserGym, an environment for designing and evaluating such agents. Its experiments show that current agents remain well short of reliable task automation, with a clear performance gap between open- and closed-source models.
