UI-Venus Technical Report: Building High-performance UI Agents with RFT

Zhangxuan Gu, Zhengwen Zeng, Zhenyu Xu, Xingran Zhou, Shuheng Shen, Yunfei Liu, Beitong Zhou, Changhua Meng, Tianyu Xia, Weizhi Chen, Yue Wen, Jingya Dou, Fei Tang, Jinzhen Lin, Yulin Liu, Zhenlin Guo, Yichen Gong, Heng Jia, Changlong Gao, Yuan Guo, Yong Deng, Zhenyu Guo, Liang Chen, Weiqiang Wang

🏛 Institutions: Ant Group
📅 Date: August 14, 2025
📑 Publisher: arXiv
💻 Env: Desktop, Mobile, Web
🔑 Keywords: model, reinforcement fine-tuning, trajectory history alignment, sparse action enhancement, Qwen2.5-VL, UI-Venus

TLDR

UI-Venus is a UI agent that takes only screenshots as input. It is built on Qwen2.5-VL and trained with reinforcement fine-tuning (RFT), supported by data-cleaning pipelines for both grounding and navigation. The report attributes its gains to reward design and to a self-evolving mechanism that combines trajectory history alignment with sparse action enhancement, and it reports strong results on the ScreenSpot benchmarks and AndroidWorld.
