GUI Agents Papers

Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining

Weimin Xiong, Shuhao Gu, Bowen Ye, Zihao Yue, Lei Li, Feifan Song, Sujian Li, Hao Tian

🏛 Institutions
Unknown
📅 Date
May 14, 2026
📑 Publisher
arXiv
💻 Env
General GUI
TLDR

Video2GUI is an automated pipeline for extracting grounded GUI interaction trajectories from unlabeled Internet videos. It constructs WildGUI, a dataset of 12 million interaction trajectories across more than 1,500 applications and websites, and uses it for generalized GUI-agent pretraining.
