World-Model-Augmented Web Agents with Action Correction

Zhouzhou Shen, Xueyu Hu, Xiyun Li, Tianqing Fang, Juncheng Li, Shengyu Zhang

🏛 Institutions: ZJU, Tencent AI Lab
📅 Date: February 17, 2026
📑 Publisher: arXiv
💻 Env: Web
🔑 Keywords: world model, action correction, judge model, consequence simulation, WAC, multi-agent collaboration

TLDR

WAC augments web agents with a world model for strategic guidance and consequence simulation, plus a judge model for feedback-driven action correction. This combination improves performance on VisualWebArena and Online-Mind2Web over prior methods.
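
The simulate-then-judge idea in the TLDR can be illustrated with a minimal sketch. This is not the authors' implementation: the function `choose_action`, its callable parameters (`propose`, `simulate`, `judge`), the `max_rounds` limit, and the toy stubs are all hypothetical names chosen for illustration, and the world model's strategic-guidance role is left out. The sketch only shows one plausible control flow: propose an action, predict its consequence, and revise the action when the judge rejects it.

```python
from typing import Callable, Optional, Tuple


def choose_action(
    propose: Callable[[str, Optional[str]], str],
    simulate: Callable[[str, str], str],
    judge: Callable[[str, str], Tuple[bool, str]],
    observation: str,
    max_rounds: int = 3,
) -> str:
    """Hypothetical correction loop: propose, simulate, judge, retry."""
    feedback: Optional[str] = None
    action = ""
    for _ in range(max_rounds):
        # The agent proposes an action, conditioned on any earlier feedback.
        action = propose(observation, feedback)
        # The world model predicts the consequence without executing it.
        predicted = simulate(observation, action)
        # The judge accepts the action or returns feedback for correction.
        accepted, feedback = judge(action, predicted)
        if accepted:
            return action
    # Assumption: fall back to the last proposal when no action is accepted.
    return action


# Toy stand-ins for the three models (illustrative only).
def toy_propose(observation: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "click(submit)"
    return "type(email); click(submit)"


def toy_simulate(observation: str, action: str) -> str:
    if "type(email)" in action:
        return "form submitted"
    return "error: email required"


def toy_judge(action: str, predicted: str) -> Tuple[bool, str]:
    ok = not predicted.startswith("error")
    return ok, ("" if ok else predicted)


result = choose_action(toy_propose, toy_simulate, toy_judge, "signup page")
print(result)
```

In the toy run, the first proposal is rejected because the simulated outcome is an error, and the second proposal, made with that feedback, is accepted.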


Related papers (24)

WebWorld: A Large-Scale World Model for Web Agent Training

February 16, 2026 · arXiv
DynaWeb: Model-Based Reinforcement Learning of Web Agents

January 29, 2026 · arXiv
Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents

November 10, 2024 · TMLR
UI-Oceanus: Scaling GUI Agents with Synthetic Environmental Dynamics

February 11, 2026 · arXiv
Code2World: A GUI World Model via Renderable Code Generation

February 10, 2026 · arXiv
SafePred: A Predictive Guardrail for Computer-Using Agents via World Models

February 2, 2026 · arXiv
MobileDreamer: Generative Sketch World Model for GUI Agent

January 7, 2026 · arXiv
MobileWorldBench: Towards Semantic World Modeling For Mobile Agents

December 16, 2025 · arXiv
R-WoM: Retrieval-augmented World Model For Computer-use Agents

October 13, 2025 · ICLR 2026 (Poster)
Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach

May 22, 2025 · Findings of EMNLP 2025
MobileExperts: A Dynamic Tool-Enabled Agent Team in Mobile Devices

July 4, 2024 · arXiv
Seeing is Believing: Vision-driven Non-crash Functional Bug Detection for Mobile Apps

July 3, 2024 · arXiv
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

June 3, 2024 · NeurIPS 2024
GUI Agents for Continual Game Generation

May 27, 2026 · arXiv
Odysseys: Benchmarking Web Agents on Realistic Long Horizon Tasks

April 27, 2026 · arXiv
WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark

April 13, 2026 · arXiv
The Amazing Agent Race: Strong Tool Users, Weak Navigators

April 11, 2026 · arXiv
Same Outcomes, Different Journeys: A Trace-Level Framework for Comparing Human and GUI-Agent Behavior in Production Search Systems

April 9, 2026 · arXiv
MolmoWeb: Open Visual Web Agent and Open Data for the Open Web

April 9, 2026 · arXiv
ClawBench: Can AI Agents Complete Everyday Online Tasks?

April 9, 2026 · arXiv
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

April 8, 2026 · arXiv
WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks

April 7, 2026 · arXiv
The Art of Building Verifiers for Computer Use Agents

April 5, 2026 · arXiv
The Tool Illusion: Rethinking Tool Use in Web Agents

April 3, 2026 · arXiv