PATHWAYS: Evaluating Investigation and Context Discovery in AI Web Agents

Shifat E. Arman, Syed Nazmus Sakib, Tapodhir Karmakar Taton, Nafiul Haque, Shahrear Bin Amin

🏛 Institutions: Robotics and Mechatronics Engineering, University of Dhaka
📅 Date: February 5, 2026
📑 Publisher: arXiv
💻 Env: Web
🔑 Keywords: benchmark, context discovery, investigation, hallucination, PATHWAYS

TLDR

PATHWAYS is a benchmark of 250 multi-step web decision tasks designed to test whether agents can uncover and correctly use hidden contextual information instead of stopping at surface cues. The results show that agents often fail to retrieve decisive hidden evidence, hallucinate investigative reasoning, and struggle to incorporate discovered context into final decisions.
