GUI Agents Papers

TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for Generalized GUI Agents

Bofei Zhang, Zirui Shang, Zhi Gao, Wang Zhang, Rui Xie, Xiaojian Ma, Tao Yuan, Xinxiao Wu, Song-Chun Zhu, Qing Li

🏛 Institutions
State Key Laboratory of General Artificial Intelligence, BIGAI, Beijing Institute of Technology, PKU, SJTU, Tsinghua
📅 Date
April 17, 2025
📑 Publisher
AAAI 2026
💻 Env
General GUI
TLDR

TongUI turns multimodal web tutorials into large-scale GUI-agent training trajectories by crawling and processing tutorial videos and articles. The resulting GUI-Net dataset spans 143K trajectories across five operating systems and more than 200 applications, and fine-tuning on it improves generalized GUI-agent performance.
