GUI Agents Papers

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Zhiyong Wu, Zhenyu Wu, Fangzhi Xu, Yian Wang, Qiushi Sun, Chengyou Jia, Kanzhi Cheng, Zichen Ding, Liheng Chen, Paul Pu Liang, Yu Qiao

🏛 Institutions
Shanghai AI Laboratory, SJTU, HKU, MIT
📅 Date
October 30, 2024
📑 Publisher
ICLR 2025 (Spotlight)
💻 Env
Desktop, Mobile, Web
🔑 Keywords
TLDR

OS-Atlas is a foundation action model for GUI agents built on a multi-platform grounding-data synthesis toolkit and a corpus with more than 13 million GUI elements. It improves GUI grounding and zero-shot out-of-distribution agent performance across desktop, mobile, and web benchmarks.
