Mind2Web: Towards a Generalist Agent for the Web

Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Sam Stevens, Boshi Wang, Huan Sun, Yu Su

🏛 Institutions: OSU
📅 Date: June 9, 2023
📑 Publisher: NeurIPS 2023 Datasets and Benchmarks Track
💻 Env: Web
🔑 Keywords: dataset benchmark Mind2Web MindAct generalist web agents

TLDR

Introduces Mind2Web, a benchmark of realistic language-guided web tasks across 137 websites and 31 domains. The companion MindAct framework first uses a smaller language model to rank candidate page elements, then passes only the top-ranked candidates to a larger model that selects the element to act on, which keeps messy real-world HTML within reach of the larger model. The benchmark is widely used for web-agent evaluation.
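
The rank-then-select idea can be illustrated with a minimal sketch. This is not the MindAct implementation: the `Element` type, the word-overlap scorer (a stand-in for the smaller ranking model), and the `choose_element` placeholder (a stand-in for the larger model) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    element_id: str
    text: str  # flattened tag name, attributes, and visible text


def overlap_score(task: str, element: Element) -> float:
    """Hypothetical stand-in for a small ranking model.

    Returns the fraction of task words that also appear in the element text.
    """
    task_words = set(task.lower().split())
    element_words = set(element.text.lower().split())
    if not task_words:
        return 0.0
    return len(task_words & element_words) / len(task_words)


def rank_candidates(task: str, elements: List[Element], k: int) -> List[Element]:
    """Stage 1: keep only the k highest-scoring elements."""
    ranked = sorted(elements, key=lambda e: overlap_score(task, e), reverse=True)
    return ranked[:k]


def choose_element(task: str, candidates: List[Element]) -> Optional[Element]:
    """Stage 2 placeholder: a larger model would pick among the candidates.

    Here we simply return the top-ranked candidate, or None if there are none.
    """
    return candidates[0] if candidates else None


page = [
    Element("e1", "button search flights"),
    Element("e2", "a link privacy policy"),
    Element("e3", "input departure city"),
]
task = "search flights to Boston"
top = rank_candidates(task, page, k=2)
target = choose_element(task, top)
```

The point of the split is that the expensive second stage only ever sees `k` short candidates instead of the full page.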


Related papers (24)

WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark

April 13, 2026 · arXiv
WebArena-Infinity: Generating Browser Environments with Verifiable Tasks at Scale

March 2026 · Blog Post
WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces

March 5, 2026 · arXiv
Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents

August 3, 2025 · ICLR 2026 (Poster)
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

May 21, 2025 · NeurIPS 2025 (Spotlight)
RealWebAssist: A Benchmark for Long-Horizon Web Assistance with Real-World Users

April 14, 2025 · AAAI 2026
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

April 11, 2025 · COLM 2025
From Grounding to Planning: Benchmarking Bottlenecks in Web Agents

September 3, 2024 · ECAI 2025
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks

July 7, 2024 · NeurIPS 2024 Datasets and Benchmarks Track (Poster)
GUI Action Narrator: Where and When Did That Action Take Place?

June 19, 2024 · arXiv
WebCanvas: Benchmarking Web Agents in Online Environments

June 18, 2024 · arXiv
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding

June 16, 2024 · ICLR 2025 (Poster)
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web

February 29, 2024 · ECCV 2024 (Poster)
On the Multi-turn Instruction Following for Conversational Web Agents

February 23, 2024 · ACL 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue

February 8, 2024 · ICML 2024
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents

January 17, 2024 · ACL 2024
WebVLN: Vision-and-Language Navigation on Websites

December 25, 2023 · AAAI 2024
CogAgent: A Visual Language Model for GUI Agents

December 14, 2023 · CVPR 2024 (Highlight)
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents

July 31, 2022 · NeurIPS 2022
Grounding Open-Domain Instructions to Automate Web Support Tasks

March 30, 2021 · NAACL 2021
WebSRC: A Dataset for Web-Based Structural Reading Comprehension

January 23, 2021 · EMNLP 2021
Gym-Anything: Turn any Software into an Agent Environment

April 7, 2026 · arXiv
PSPA-Bench: A Personalized Benchmark for Smartphone GUI Agent

March 31, 2026 · arXiv
SecAgent: Efficient Mobile GUI Agent with Semantic Context

March 9, 2026 · arXiv