GUI Agents Papers

SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization

Jinyang Wu, Changpeng Yang, Yuhao Shen, Fangzhi Xu, Bolin Ni, Chonghua Liao, Yuchen Liu, Hongzhen Wang, Shuai Nie, Shuai Zhang, Haoran Luo, Jiaming Xu

🏛 Institutions
Tsinghua, Xiaomi, ZJU, NTU, Institute of Automation, CAS
📅 Date
January 30, 2026
📑 Publisher
arXiv
💻 Env
General GUI
TLDR

SSL replaces binary verifier rewards with progressively amplified tiered rewards that distinguish higher- and lower-quality successful trajectories. Across GUI perception, short- and long-horizon planning, and reasoning benchmarks, it improves optimization stability and reaches up to 2.5x better sample efficiency than binary-reward baselines.
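To make the contrast with a binary reward concrete, here is a minimal sketch of a tiered reward. It is an illustration only, not the paper's formulation: the function name `tiered_reward`, the `quality` score in [0, 1], the number of tiers, and the geometric `growth` factor are all assumptions chosen for the example.

```python
def tiered_reward(success: bool, quality: float,
                  num_tiers: int = 3, base: float = 1.0,
                  growth: float = 2.0) -> float:
    """Hypothetical tiered reward; all parameters are illustrative assumptions.

    A binary verifier would return 1.0 for any success. Here a successful
    trajectory is placed into one of `num_tiers` buckets by its quality
    score, and higher tiers receive a geometrically larger reward.
    """
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must lie in [0, 1]")
    if not success:
        # Failed trajectories get no reward, as in the binary case.
        return 0.0
    # Map quality to a tier index in 0 .. num_tiers - 1.
    tier = min(int(quality * num_tiers), num_tiers - 1)
    # Amplify the reward progressively with the tier index.
    return base * growth ** tier


if __name__ == "__main__":
    for q in (0.1, 0.5, 1.0):
        print(q, tiered_reward(True, q))
```

Under these assumed defaults, two successful trajectories no longer receive the same signal: the one with the higher quality score lands in a higher tier and earns a larger reward, which is the kind of differentiation the TLDR describes.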
