GUI Agents Papers

OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution

Le Zhang, Yixiong Xiao, Xinjiang Lu, Jingjia Cao, Yusai Zhao, Jingbo Zhou, Lang An, Zikan Feng, Wanxiang Sha, Yu Shi, Congxi Xiao, Jian Xiong, Yankai Zhang, Hua Wu, Haifeng Wang

🏛 Institutions
Baidu Frontier Research Department
📅 Date
January 28, 2026
📑 Publisher
arXiv
💻 Env
Desktop, Mobile
TLDR

OmegaUse is a general-purpose GUI agent for both phone-use and computer-use settings, built on a mixture-of-experts (MoE) backbone. It is trained on a pipeline that combines curated and synthetic data, using a two-stage recipe: supervised fine-tuning (SFT) followed by GRPO. The paper also introduces the OS-Nav suite and reports strong cross-terminal results on ScreenSpot-v2, AndroidControl, ChiM-Nav, and Ubu-Nav.
