GUI Agents Papers

STEVE: A Step Verification Pipeline for Computer-use Agent Training

Fanbin Lu, Zhisheng Zhong, Ziqin Wei, Shu Liu, Chi-Wing Fu, Jiaya Jia

🏛 Institutions
CUHK, SmartMore, HKUST
📅 Date
March 16, 2025
📑 Publisher
arXiv
💻 Env
Desktop
TLDR

STEVE trains desktop computer-use agents from suboptimal trajectories by verifying each step against before-and-after screenshots, rather than relying on expensive gold trajectories. The resulting binary step labels are used for KTO (Kahneman-Tversky Optimization) training of a 7B agent, which outperforms supervised fine-tuning on WinAgentArena.
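
As a rough illustration of how per-step verification can yield binary training labels, the sketch below turns a recorded trajectory into unpaired (prompt, completion, label) examples of the kind KTO-style training consumes. Everything here is hypothetical and not taken from the paper: the `Step` record, the `label_trajectory` helper, and especially `toy_verifier`, which merely checks whether the screen changed and stands in for a real verifier that would judge the before-and-after screenshots.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class Step:
    """One agent step (hypothetical record layout, not the paper's schema)."""
    instruction: str        # task the agent is trying to complete
    screenshot_before: str  # identifier of the screen before the action
    action: str             # action the agent emitted
    screenshot_after: str   # identifier of the screen after the action


# A verifier maps a step to True (desirable) or False (undesirable).
Verifier = Callable[[Step], bool]
Example = Dict[str, Union[str, bool]]


def toy_verifier(step: Step) -> bool:
    # Placeholder assumption: treat a step as correct if the screen changed.
    # A real verifier would inspect both screenshots against the instruction.
    return step.screenshot_before != step.screenshot_after


def label_trajectory(steps: List[Step], verify: Verifier) -> List[Example]:
    """Convert a trajectory into unpaired binary-labelled examples."""
    return [
        {
            "prompt": f"{s.instruction}\n[screen: {s.screenshot_before}]",
            "completion": s.action,
            "label": verify(s),
        }
        for s in steps
    ]


# Usage: a two-step trajectory where the second click has no effect.
trajectory = [
    Step("Open Notepad", "s0.png", "click(Start)", "s1.png"),
    Step("Open Notepad", "s1.png", "click(blank area)", "s1.png"),
]
examples = label_trajectory(trajectory, toy_verifier)
```

Because each step is labelled independently, a trajectory that ultimately fails can still contribute its correct steps as positive examples, and its incorrect steps as negative ones.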
