Agentic Test-Time Scaling for WebAgents
Nicholas Lee, Lutfi Eren Erdogan, Chris Joseph John, Surya Krishnapillai, Michael W. Mahoney, Kurt Keutzer, Amir Gholami
- 🏛 Institutions: UC Berkeley, ICSI, LBNL
- 📅 Date: February 12, 2026
- 📑 Publisher: arXiv
- 💻 Env: Web
TLDR
CATTS dynamically allocates test-time compute for multi-step web agents by using vote-based uncertainty signals to invoke an LLM arbiter only on contentious decisions. It improves performance on WebArena-Lite and GoBrowse while using fewer tokens than uniform scaling.
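The gating idea in the summary above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary only says "vote-based uncertainty signals", so the use of Shannon entropy over sampled action votes, the threshold value, and the names `vote_entropy`, `choose_action`, and `arbiter` are all assumptions made for illustration.

```python
from __future__ import annotations

import math
from collections import Counter
from typing import Callable, Sequence


def vote_entropy(votes: Sequence[str]) -> float:
    """Shannon entropy (in bits) of the empirical vote distribution."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def choose_action(
    votes: Sequence[str],
    arbiter: Callable[[list[str]], str],
    entropy_threshold: float = 0.9,  # hypothetical value, not from the paper
) -> tuple[str, bool]:
    """Pick one action for the current step.

    Returns (action, arbiter_used). The arbiter (a stand-in for an extra
    LLM call) is consulted only when the votes disagree enough; otherwise
    the majority vote is returned without spending more compute.
    """
    if not votes:
        raise ValueError("at least one sampled action is required")
    counts = Counter(votes)
    if vote_entropy(votes) > entropy_threshold:
        # Contentious step: pass the distinct candidates to the arbiter.
        candidates = [action for action, _ in counts.most_common()]
        return arbiter(candidates), True
    # Confident step: plain majority vote.
    return counts.most_common(1)[0][0], False
```

With five sampled actions that mostly agree, the majority wins and the arbiter is never called; an even split triggers it. The saving relative to uniform scaling comes from skipping the arbiter call on steps where the samples already agree.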
Related papers
- WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning · May 22, 2025 · EMNLP 2025 (Poster)
- WebATLAS: An LLM Agent with Experience-Driven Memory and Action Simulation · October 26, 2025 · NeurIPS 2025 Workshop on Language Agents and World Models
- JEF-Hinter: Leveraging Offline Knowledge for Improving Web Agents Adaptation · October 5, 2025 · arXiv
- Test‑Time Reinforcement Learning for GUI Grounding via Region Consistency · August 7, 2025 · AAAI 2026
- ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search · May 21, 2025 · arXiv
- GAIA: A Data Flywheel System for Training GUI Test-Time Scaling Critic Models · January 26, 2026 · arXiv