GUI Agents Papers

MobileDreamer: Generative Sketch World Model for GUI Agent

Yilin Cao, Yufeng Zhong, Zhixiong Zeng, Liming Zheng, Jing Huang, Haibo Qiu, Peng Shi, Wenji Mao, Wan Guanglu

🏛 Institutions
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, University of Chinese Academy of Sciences, Meituan
📅 Date
January 7, 2026
📑 Publisher
arXiv
💻 Env
Mobile
🔑 Keywords
TLDR

MobileDreamer equips mobile GUI agents with a lightweight world model that predicts task-relevant textual sketches of future interface states instead of full screenshots. It then uses rollout imagination over those predicted futures for action selection, improving AndroidWorld performance by 5.25% and reaching state of the art.
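
The rollout-imagination idea summarized above can be illustrated with a minimal sketch. This is not the paper's implementation: the function `imagine_and_select`, the toy world model, and the keyword-overlap scorer are all hypothetical stand-ins, chosen only to show how candidate actions might be ranked by the textual sketch each one is predicted to produce.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type aliases (not from the paper):
# a world model maps (current sketch, action) -> predicted next sketch,
# a scorer maps (task, sketch) -> how promising that sketch looks for the task.
WorldModel = Callable[[str, str], str]
Scorer = Callable[[str, str], float]


def imagine_and_select(
    task: str,
    sketch: str,
    candidates: Sequence[str],
    world_model: WorldModel,
    scorer: Scorer,
) -> Optional[str]:
    """Return the candidate whose imagined next sketch scores highest."""
    best_action, best_score = None, float("-inf")
    for action in candidates:
        future = world_model(sketch, action)  # imagined next interface state
        score = scorer(task, future)
        if score > best_score:
            best_action, best_score = action, score
    return best_action


def toy_world_model(sketch: str, action: str) -> str:
    """Toy stand-in: a lookup table of textual sketches, not a learned model."""
    transitions = {
        ("home", "tap Settings"): "settings: [Wi-Fi] [Bluetooth]",
        ("home", "tap Camera"): "camera: [Shutter]",
    }
    # Unknown transitions leave the sketch unchanged.
    return transitions.get((sketch, action), sketch)


def toy_scorer(task: str, sketch: str) -> float:
    """Toy stand-in: count task words that appear in the predicted sketch."""
    return float(sum(word.lower() in sketch.lower() for word in task.split()))


chosen = imagine_and_select(
    task="open Wi-Fi settings",
    sketch="home",
    candidates=["tap Camera", "tap Settings"],
    world_model=toy_world_model,
    scorer=toy_scorer,
)
print(chosen)
```

In this toy setup the agent prefers "tap Settings" because its imagined sketch mentions more of the task's words. In a real system both the world model and the scorer would presumably be learned models, and rollouts could extend over several steps.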
