GUI Agents Papers

SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization

Jinyang Wu, Changpeng Yang, Yuhao Shen, Fangzhi Xu, Bolin Ni, Chonghua Liao, Yuchen Liu, Hongzhen Wang, Shuai Nie, Shuai Zhang, Haoran Luo, Jiaming Xu

🏛 Institutions
Tsinghua, Xiaomi, ZJU, NTU, Institute of Automation, CAS
📅 Date
January 30, 2026
📑 Publisher
arXiv
💻 Env
General GUI
🔑 Keywords
TLDR

SSL replaces binary verifier rewards with progressively amplified tiered rewards that distinguish higher- and lower-quality successful trajectories. Across GUI perception, short- and long-horizon planning, and reasoning benchmarks, it improves optimization stability and reaches up to 2.5x better sample efficiency than binary-reward baselines.
