GUI Agents Papers

ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

Fanbin Lu, Zhisheng Zhong, Shu Liu, Chi-Wing Fu, Jiaya Jia

🏛 Institutions
CUHK, SmartMore, HKUST
📅 Date
May 22, 2025
📑 Publisher
arXiv
💻 Env
Desktop
🔑 Keywords
TLDR

ARPO studies end-to-end reinforcement learning for GUI agents in long-horizon desktop environments where sparse rewards and rollout cost make optimization difficult. It augments GRPO with replayed successful experience and task selection, establishing a stronger OSWorld training baseline than prior policy-optimization approaches.
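To make the idea of augmenting GRPO with replayed successful experience concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes binary task rewards, a per-task buffer of successful rollouts, and a rule that swaps one stored success into a group whose rollouts all failed, so that the group-normalized advantages are not all zero. The names `SuccessReplayBuffer`, `group_advantages`, and `build_group` are hypothetical, and task selection is not shown.

```python
import random
import statistics
from collections import defaultdict


def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: center and scale rewards within one rollout group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


class SuccessReplayBuffer:
    """Hypothetical helper: keeps a bounded list of successful rollouts per task."""

    def __init__(self, capacity_per_task=8):
        self.capacity = capacity_per_task
        self.store = defaultdict(list)

    def add(self, task_id, rollout):
        bucket = self.store[task_id]
        bucket.append(rollout)
        if len(bucket) > self.capacity:
            bucket.pop(0)  # drop the oldest stored success

    def sample(self, task_id, rng=random):
        bucket = self.store.get(task_id)
        return rng.choice(bucket) if bucket else None


def build_group(task_id, rollouts, buffer):
    """Return (group, advantages) for a list of (trajectory, reward) rollouts.

    Assumption: if every fresh rollout failed, one of them is replaced by a
    stored success for the same task, when the buffer has one.
    """
    group = list(rollouts)
    for traj, reward in group:
        if reward > 0:
            buffer.add(task_id, (traj, reward))
    if all(reward == 0 for _, reward in group):
        replayed = buffer.sample(task_id)
        if replayed is not None:
            group[0] = replayed
    rewards = [reward for _, reward in group]
    return group, group_advantages(rewards)
```

Without the replay step, a group of all-failed rollouts yields zero advantage for every sample and therefore no learning signal under sparse rewards, which is the difficulty the summary above describes.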
