WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks

Hao Bai, Alexey Taymanov, Tong Zhang, Aviral Kumar, Spencer Whitehead

🏛 Institutions: Microsoft, UIUC, CMU
📅 Date: January 5, 2026
📑 Publisher: arXiv
💻 Env: Web
🔑 Keywords: benchmark, training environment, reinforcement learning, asynchronous rollouts, WebGym

TLDR

WebGym provides a large-scale open training environment for visual web agents with nearly 300,000 rubric-evaluated tasks on realistic websites. It also includes a high-throughput asynchronous rollout system, and agents fine-tuned on WebGym improve from 26.2% to 42.9% on out-of-distribution websites, outperforming GPT-4o and GPT-5-Thinking.
