GUI Agents Papers

The Amazing Agent Race: Strong Tool Users, Weak Navigators

Zae Myung Kim, Dongseok Lee, Jaehyung Kim, Vipul Raheja, Dongyeop Kang

🏛 Institutions
University of Minnesota
📅 Date
April 11, 2026
📑 Publisher
arXiv
💻 Env
Web
🔑 Keywords
TLDR

The Amazing Agent Race introduces 1,400 DAG-puzzle legs that require fork-merge tool chains over Wikipedia, distinguishing navigation from tool-use ability. The best agent reaches only 37.2%, with navigation errors dominating (27-52% of trials) while tool-use errors stay below 17%, revealing a navigation blind spot invisible to linear benchmarks.
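
To make the "fork-merge" shape concrete, here is a minimal sketch of a DAG-structured leg, in contrast to a linear chain where each step has exactly one predecessor. Everything in it is an illustrative assumption and not taken from the paper: the node names, the `LEG` and `STEPS` dictionaries, the `run_leg` helper, and the toy arithmetic that stands in for real tools.

```python
from graphlib import TopologicalSorter

# Hypothetical leg: each node maps to the nodes it depends on.
# A fork is one node feeding several; a merge is one node consuming several.
LEG = {
    "start_page": [],
    "lookup_a": ["start_page"],           # fork, branch 1
    "lookup_b": ["start_page"],           # fork, branch 2
    "combine": ["lookup_a", "lookup_b"],  # merge of both branches
}

# Toy stand-ins for tools (assumed; the summary does not name the real ones).
STEPS = {
    "start_page": lambda: 10,
    "lookup_a": lambda x: x + 1,
    "lookup_b": lambda x: x * 2,
    "combine": lambda a, b: a + b,
}


def run_leg(leg, steps):
    """Evaluate every node after its dependencies and return all results."""
    results = {}
    for node in TopologicalSorter(leg).static_order():
        args = [results[dep] for dep in leg[node]]
        results[node] = steps[node](*args)
    return results


results = run_leg(LEG, STEPS)
print(results["combine"])
```

In this toy setup the merge step cannot run until both branches have produced a value, which is the structural property a purely linear chain never exercises.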
