GUI Agents Papers

Video-Based Reward Modeling for Computer-Use Agents

Linxin Song, Jieyu Zhang, Huanxin Sheng, Taiwei Shi, Gupta Rahul, Yang Liu, Ranjay Krishna, Jian Kang, Jieyu Zhao

🏛 Institutions
USC, University of Washington, MBZUAI, Amazon AGI
📅 Date
March 10, 2026
📑 Publisher
arXiv
💻 Env
Desktop, Mobile
TLDR

This paper studies reward modeling from execution video rather than agent internals, introducing the ExeVR-53k dataset and an execution-video reward model that predicts success from keyframes plus the user instruction. The model scales evaluation across Ubuntu, macOS, Windows, and Android, outperforming strong proprietary models while providing finer temporal attribution.
