GUI Agents Papers
Star · 751

OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution

Le Zhang, Yixiong Xiao, Xinjiang Lu, Jingjia Cao, Yusai Zhao, Jingbo Zhou, Lang An, Zikan Feng, Wanxiang Sha, Yu Shi, Congxi Xiao, Jian Xiong, Yankai Zhang, Hua Wu, Haifeng Wang

🏛 Institutions
Baidu Frontier Research Department
📅 Date
January 28, 2026
📑 Publisher
arXiv
💻 Env
Desktop Mobile
🔑 Keywords
TLDR

OmegaUse is a general-purpose GUI agent for both phone-use and computer-use settings trained with a curated-plus-synthetic data pipeline and a two-stage SFT-then-GRPO recipe on an MoE backbone. It also introduces the OS-Nav suite and reports strong cross-terminal results on ScreenSpot-v2, AndroidControl, ChiM-Nav, and Ubu-Nav.

Open paper arXiv Edit on GitHub Report issue
Related papers