GUI Agents Papers

Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction

Junhong Shen, Hao Bai, Lunjun Zhang, Yifei Zhou, Amrith Setlur, Shengbang Tong, Diego Caples, Nan Jiang, Tong Zhang, Ameet Talwalkar, Aviral Kumar

🏛 Institutions
CMU, Scribe, UIUC, University of Toronto, UC Berkeley, The AGI Company, New York University
📅 Date
June 9, 2025
📑 Publisher
SEA @ NeurIPS 2025 (Oral)
💻 Env
Web
🔑 Keywords
TLDR

This paper argues that interactive web agents benefit more from scaling how long they can interact with the environment than from merely lengthening pre-action reasoning traces. It introduces Test-Time Interaction (TTI), an online RL method that increases rollout horizons and yields stronger WebVoyager and WebArena agents with richer exploration and replanning behavior.
