From Grounding to Planning: Benchmarking Bottlenecks in Web Agents

Segev Shlomov, Ben Wiesel, Aviad Sela, Ido Levy, Liane Galanti, Roy Abitbol

🏛 Institutions: IBM
📅 Date: September 3, 2024
📑 Publisher: ECAI 2025
💻 Env: Web
🔑 Keywords: benchmark, planning bottleneck, grounding bottleneck, component-wise evaluation, Mind2Web

TLDR

This paper refines Mind2Web into separate planning and grounding benchmarks to diagnose which component limits web-agent performance. Its analysis argues that planning, not grounding, is the dominant bottleneck: when grounding is evaluated in isolation, current techniques already reach near-perfect element accuracy.
