GUI Agents Papers

UltraCUA: A Foundation Model for Computer Use Agents with Hybrid Action

Yuhao Yang, Zhen Yang, Zi-Yi Dou, Anh Nguyen, Keen You, Omar Attia, Andrew Szot, Michael Feng, Ram Ramrakhya, Alexander Toshev, Chao Huang, Yinfei Yang, Zhe Gan

🏛 Institutions
Apple, HKU
📅 Date
October 20, 2025
📑 Publisher
arXiv
💻 Env
Desktop, Web
🔑 Keywords
TLDR

UltraCUA bridges low-level GUI actions and higher-level tool use in one computer-use model instead of forcing every task through clicks, typing, and scrolling alone. Its pipeline combines automated tool extraction, synthetic verifiable tasks, supervised fine-tuning, and online RL, and the resulting hybrid-action models improve both OSWorld performance and transfer to WindowsAgentArena.
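To make the hybrid-action idea concrete, here is a minimal sketch of an action space where each step is either a low-level GUI primitive or a higher-level tool call. It is an illustration only, not the paper's implementation: `GuiAction`, `ToolCall`, `HybridExecutor`, and the `set_cell` tool are all hypothetical names, and the GUI branch only logs the action instead of driving a real desktop.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Union


@dataclass
class GuiAction:
    """Low-level primitive such as click, type, or scroll (hypothetical schema)."""
    kind: str
    args: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ToolCall:
    """Higher-level programmatic call to a named tool (hypothetical schema)."""
    name: str
    args: Dict[str, Any] = field(default_factory=dict)


Action = Union[GuiAction, ToolCall]


class HybridExecutor:
    """Routes each step to either a registered tool or a GUI primitive."""

    GUI_KINDS = {"click", "type", "scroll"}

    def __init__(self, tools: Dict[str, Callable[..., Any]]) -> None:
        self.tools = dict(tools)
        self.log: List[str] = []

    def step(self, action: Action) -> Any:
        if isinstance(action, ToolCall):
            if action.name not in self.tools:
                raise KeyError(f"unknown tool: {action.name}")
            result = self.tools[action.name](**action.args)
            self.log.append(f"tool:{action.name}")
            return result
        if isinstance(action, GuiAction):
            if action.kind not in self.GUI_KINDS:
                raise ValueError(f"unknown GUI action: {action.kind}")
            # Assumption: a real agent would send this to the OS or browser;
            # here the primitive is only recorded.
            self.log.append(f"gui:{action.kind}")
            return None
        raise TypeError(f"unsupported action type: {type(action).__name__}")


if __name__ == "__main__":
    # Hypothetical tool standing in for an automatically extracted one.
    executor = HybridExecutor(
        {"set_cell": lambda cell, value: f"{cell}={value}"}
    )
    executor.step(ToolCall("set_cell", {"cell": "A1", "value": 3}))
    executor.step(GuiAction("click", {"x": 10, "y": 20}))
    print(executor.log)
```

In this sketch a single policy output type covers both branches, so one trajectory can mix a tool call with ordinary clicks without switching models.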
