The BrowserGym Ecosystem for Web Agent Research

Thibault Le Sellier de Chezelles, Maxime Gasse, Alexandre Lacoste, Massimo Caccia, Alexandre Drouin, Léo Boisvert, Megh Thakkar, Tom Marty, Rim Assouel, Sahar Omidi Shayegan, Lawrence Keunho Jang, Xing Han Lù, Ori Yoran, Dehan Kong, Frank F. Xu, Siva Reddy, Graham Neubig, Quentin Cappart, Russ Salakhutdinov, Nicolas Chapados

🏛 Institutions: ServiceNow Research, ServiceNow, Laval University, imean.ai, Microsoft, CMU, Polytechnique Montréal, Université de Montréal
📅 Date: December 6, 2024
📑 Publisher: TMLR
💻 Env: Web
🔑 Keywords: benchmark framework BrowserGym AgentLab evaluation ecosystem

TLDR

BrowserGym is a unified ecosystem for web-agent research that standardizes observation and action spaces while wrapping multiple existing benchmarks under one interface. The paper also introduces AgentLab for agent creation and analysis, and uses the ecosystem to run a large cross-benchmark comparison of six frontier LLMs.
