GUI Agents Papers

Demo2Tutorial: From Human Experience to Multimodal Software Tutorials

Zechen Bai, Zhiheng Chen, Yiqi Lin, Kevin Qinghong Lin, Difei Gao, Xiangwu Guo, Xin Wang, Mike Zheng Shou

🏛 Institutions
Unknown
📅 Date
June 2, 2026
📑 Publisher
arXiv
💻 Env
General GUI
🔑 Keywords
TLDR

Demo2Tutorial converts screen recordings and interaction logs into structured multimodal software tutorials with parsed actions, intents, and hierarchical task graphs. The paper evaluates tutorial generation quality and shows that the resulting representations improve downstream GUI-agent planning and generalization.
