World-Model-Augmented Web Agents with Action Correction
Zhouzhou Shen, Xueyu Hu, Xiyun Li, Tianqing Fang, Juncheng Li, Shengyu Zhang
- 🏛 Institutions
- ZJU, Tencent AI Lab
- 📅 Date
- February 17, 2026
- 📑 Publisher
- arXiv
- 💻 Env
- Web
- 🔑 Keywords
TLDR
WAC augments web agents with a world model for strategic guidance and consequence simulation, plus a judge model for feedback-driven action correction. This combination improves performance on VisualWebArena and Online-Mind2Web over prior methods.
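The propose, simulate, judge, and correct loop described in the TLDR can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function `choose_action`, the `Verdict` type, the callable signatures, and the `max_corrections` parameter are all hypothetical names chosen for illustration, and the toy stubs stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    """Hypothetical judge output: an approval flag plus textual feedback."""
    approved: bool
    feedback: str = ""


def choose_action(
    observation: str,
    propose: Callable[[str, str, str], str],      # (obs, guidance, feedback) -> action
    simulate: Callable[[str, str], str],          # (obs, action) -> predicted outcome
    judge: Callable[[str, str, str], Verdict],    # (obs, action, predicted) -> verdict
    guidance: str = "",
    max_corrections: int = 2,
) -> str:
    """Return an action, revising it when the judge rejects the simulated outcome."""
    feedback = ""
    action = propose(observation, guidance, feedback)
    for _ in range(max_corrections):
        predicted = simulate(observation, action)       # world model: simulate the consequence
        verdict = judge(observation, action, predicted)
        if verdict.approved:
            return action
        feedback = verdict.feedback                     # feed the critique back to the agent
        action = propose(observation, guidance, feedback)
    return action  # correction budget exhausted: fall back to the last proposal


# Toy stubs (assumptions, not real models) that exercise one correction round.
def toy_propose(obs: str, guidance: str, feedback: str) -> str:
    return "click('Add to cart')" if feedback else "click('Delete account')"


def toy_simulate(obs: str, action: str) -> str:
    return "account removed" if "Delete" in action else "item added to cart"


def toy_judge(obs: str, action: str, predicted: str) -> Verdict:
    if "removed" in predicted:
        return Verdict(False, "predicted outcome is destructive; pick another element")
    return Verdict(True)


chosen = choose_action("product page", toy_propose, toy_simulate, toy_judge)
print(chosen)
```

In this sketch the action is only simulated, never executed, until the judge approves it or the correction budget runs out. Whether WAC itself gates execution this way is an assumption of the example.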
Related papers (24)
- WebWorld: A Large-Scale World Model for Web Agent Training · February 16, 2026 · arXiv
- DynaWeb: Model-Based Reinforcement Learning of Web Agents · January 29, 2026 · arXiv
- Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents · November 10, 2024 · TMLR
- UI-Oceanus: Scaling GUI Agents with Synthetic Environmental Dynamics · February 11, 2026 · arXiv
- Code2World: A GUI World Model via Renderable Code Generation · February 10, 2026 · arXiv
- SafePred: A Predictive Guardrail for Computer-Using Agents via World Models · February 2, 2026 · arXiv
- MobileDreamer: Generative Sketch World Model for GUI Agent · January 7, 2026 · arXiv
- MobileWorldBench: Towards Semantic World Modeling For Mobile Agents · December 16, 2025 · arXiv
- R-WoM: Retrieval-augmented World Model For Computer-use Agents · October 13, 2025 · ICLR 2026 (Poster)
- Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach · May 22, 2025 · Findings of EMNLP 2025
- MobileExperts: A Dynamic Tool-Enabled Agent Team in Mobile Devices · July 4, 2024 · arXiv
- Seeing is Believing: Vision-driven Non-crash Functional Bug Detection for Mobile Apps · July 3, 2024 · arXiv
- Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration · June 3, 2024 · NeurIPS 2024
- GUI Agents for Continual Game Generation · May 27, 2026 · arXiv
- Odysseys: Benchmarking Web Agents on Realistic Long Horizon Tasks · April 27, 2026 · arXiv
- WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark · April 13, 2026 · arXiv
- The Amazing Agent Race: Strong Tool Users, Weak Navigators · April 11, 2026 · arXiv
- Same Outcomes, Different Journeys: A Trace-Level Framework for Comparing Human and GUI-Agent Behavior in Production Search Systems · April 9, 2026 · arXiv
- MolmoWeb: Open Visual Web Agent and Open Data for the Open Web · April 9, 2026 · arXiv
- ClawBench: Can AI Agents Complete Everyday Online Tasks? · April 9, 2026 · arXiv
- GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents · April 8, 2026 · arXiv
- WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks · April 7, 2026 · arXiv
- The Art of Building Verifiers for Computer Use Agents · April 5, 2026 · arXiv
- The Tool Illusion: Rethinking Tool Use in Web Agents · April 3, 2026 · arXiv