Agentic Reward Modeling: Verifying GUI Agent via Online Proactive Interaction

Chaoqun Cui, Jing Huang, Shijing Wang, Liming Zheng, Qingchao Kong, Zhixiong Zeng

🏛 Institutions: Institute of Automation, CAS, University of Chinese Academy of Sciences, Meituan, Beijing Jiaotong University
📅 Date: January 31, 2026
📑 Publisher: arXiv
💻 Env: General GUI
🔑 Keywords: reward modeling, verification, proactive interaction, VAGEN, OSWorld, AndroidWorld

TLDR

VAGEN turns GUI-agent verification into an active interaction problem, using a verifier agent to probe the environment for evidence of task completion instead of relying on passive judgment. This substantially improves verification accuracy on OSWorld and AndroidWorld.
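
The contrast between passive judgment and active probing can be illustrated with a toy sketch. This is not the paper's implementation: `ToyEnv`, `passive_verify`, and `active_verify` are hypothetical names invented for this example, and the environment is a plain dictionary standing in for real GUI state. The point is only that a verifier which interacts with the environment can catch a failure that the final observation hides.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyEnv:
    """Hypothetical stand-in for a GUI environment: a dict of settings."""
    state: Dict[str, str] = field(default_factory=dict)

    def query(self, key: str) -> str:
        # A read-only probe, e.g. opening a settings page to inspect a value.
        return self.state.get(key, "")


def passive_verify(final_observation: str, expected_text: str) -> bool:
    # Passive judgment: decide only from what the final observation shows.
    return expected_text in final_observation


def active_verify(env: ToyEnv, probes: Dict[str, str]) -> bool:
    # Active verification: probe the environment and check each piece of evidence.
    return all(env.query(key) == expected for key, expected in probes.items())


# The agent's last screen claims success, but the underlying setting never changed.
env = ToyEnv(state={"wifi": "on"})
final_observation = "Wi-Fi turned off"

passive_result = passive_verify(final_observation, "Wi-Fi turned off")
active_result = active_verify(env, {"wifi": "off"})
print(passive_result, active_result)  # prints: True False
```

In this toy case the passive check accepts the trajectory because the last screen looks right, while the probing check rejects it because the queried state disagrees.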

Related papers (24)

Agent Alpha: Tree Search Unifying Generation, Exploration and Evaluation for Computer-Use Agents
February 3, 2026 · arXiv

BEAP-Agent: Backtrackable Execution and Adaptive Planning for GUI Agents
January 29, 2026 · arXiv

From Off-Policy to On-Policy: Enhancing GUI Agents via Bi-level Expert-to-Policy Assimilation
January 9, 2026 · arXiv

R-WoM: Retrieval-augmented World Model For Computer-use Agents
October 13, 2025 · ICLR 2026 (Poster)

Just Do It!? Computer-Use Agents Exhibit Blind Goal-Directedness
October 2, 2025 · ICLR 2026 (Poster)

Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents
April 1, 2025 · COLM 2025

SE-GA: Memory-Augmented Self-Evolution for GUI Agents
May 16, 2026 · arXiv

IntentScore: Intent-Conditioned Action Evaluation for Computer-Use Agents
April 6, 2026 · arXiv

The Art of Building Verifiers for Computer Use Agents
April 5, 2026 · arXiv

GUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play Annotation
March 27, 2026 · arXiv

UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience
March 25, 2026 · arXiv

ContractSkill: Repairable Contract-Based Skills for Multimodal Web Agents
March 20, 2026 · arXiv

HATS: Hardness-Aware Trajectory Synthesis for GUI Agents
March 12, 2026 · CVPR 2026

Adaptive Milestone Reward for GUI Agents
February 12, 2026 · arXiv

ANCHOR: Branch-Point Data Generation for GUI Agents
February 6, 2026 · arXiv

EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience
January 22, 2026 · arXiv

CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents
January 14, 2026 · arXiv

Watch and Learn: Learning to Use Computers from Online Videos
October 6, 2025 · CVPR 2026

Scaling Agents for Computer Use
October 2, 2025 · arXiv

Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation
September 28, 2025 · arXiv

MobileRL: Online Agentic Reinforcement Learning for Mobile GUI Agents
September 10, 2025 · ICLR 2026 (Poster)

Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control
September 1, 2025 · NeurIPS 2025 (Poster)

ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents
August 19, 2025 · ICLR 2026 (Poster)

Evolving in Tasks: Empowering the Multi-modality Large Language Model as the Computer Use Agent
August 6, 2025 · arXiv