WebArena-Infinity: Generating Browser Environments with Verifiable Tasks at Scale
Shuyan Zhou
- 🏛 Institutions: Duke University
- 📅 Date: March 2026
- 📑 Publisher: Blog Post
- 💻 Env: Web
TLDR
WebArena-Infinity automates the generation of high-authenticity web environments with verifiable tasks from static artifacts like user manuals, using a multi-agent pipeline of coding and browser-use agents. It produces 10 environments with 1,260 tasks and 2,070 trajectories. Agents achieve notably lower success rates than on manually built benchmarks, suggesting the generated tasks capture meaningful complexity.
Related papers
- WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark · April 13, 2026 · arXiv
- When Users Change Their Mind: Evaluating Interruptible Agents in Long-Horizon Web Navigation · April 1, 2026 · arXiv
- WebFactory: Automated Compression of Foundational Language Intelligence into Grounded Web Agents · March 5, 2026 · arXiv
- WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces · March 5, 2026 · arXiv
- OpAgent: Operator Agent for Web Navigation · February 14, 2026 · arXiv
- InfiniteWeb: Scalable Web Environment Synthesis for GUI Agent Training · January 7, 2026 · arXiv