GUI Agents Papers

CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

Xiangru Jian, Shravan Nayak, Kevin Qinghong Lin, Aarash Feizi, Kaixin Li, Patrice Bechard, Spandana Gella, Sai Rajeswar

🏛 Institutions
ServiceNow, University of Waterloo, Mila, Université de Montréal, McGill University, Oxford, NUS
📅 Date
March 25, 2026
📑 Publisher
arXiv
💻 Env
Desktop
TLDR

CUA-Suite is a large-scale desktop-agent data ecosystem centered on continuous expert video rather than sparse screenshots. It combines VideoCUA, UI-Vision, and GroundCUA to provide 55 hours of demonstrations, dense grounding annotations, and evaluation data across 87 professional desktop applications where current foundation action models still fail frequently.
