MobileDreamer: Generative Sketch World Model for GUI Agent

Yilin Cao, Yufeng Zhong, Zhixiong Zeng, Liming Zheng, Jing Huang, Haibo Qiu, Peng Shi, Wenji Mao, Wan Guanglu

🏛 Institutions: State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, University of Chinese Academy of Sciences, Meituan
📅 Date: January 7, 2026
📑 Publisher: arXiv
💻 Env: Mobile
🔑 Keywords: world model, lookahead, rollout imagination, MobileDreamer

TLDR

MobileDreamer equips mobile GUI agents with a lightweight world model that predicts task-relevant textual sketches of future interface states instead of full screenshots. It then uses rollout imagination over those predicted futures for action selection, improving AndroidWorld performance by 5.25% and reaching state of the art.
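
The lookahead idea in the summary above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `propose`, `predict`, and `score`, the string-based sketch representation, and the toy transition table are all assumptions made for illustration. The agent imagines the textual sketch that each candidate action would lead to, rolls forward a few steps, and picks the action whose imagined future scores best against the goal.

```python
from typing import Callable, List

Sketch = str  # hypothetical: a textual sketch of a screen state
Action = str  # hypothetical: a symbolic GUI action


def select_action(
    sketch: Sketch,
    goal: str,
    propose: Callable[[Sketch], List[Action]],
    predict: Callable[[Sketch, Action], Sketch],
    score: Callable[[Sketch, str], float],
    depth: int = 2,
) -> Action:
    """Pick the candidate action whose imagined rollout scores best."""

    def rollout_value(state: Sketch, remaining: int) -> float:
        # Best score found at this state or within `remaining` further steps.
        best = score(state, goal)
        if remaining == 0:
            return best
        for action in propose(state):
            imagined = predict(state, action)
            best = max(best, rollout_value(imagined, remaining - 1))
        return best

    candidates = propose(sketch)
    if not candidates:
        raise ValueError("no candidate actions")
    return max(
        candidates,
        key=lambda a: rollout_value(predict(sketch, a), depth - 1),
    )


# Toy stand-ins for the world model and scorer (assumed, not from the paper).
TRANSITIONS = {
    ("home", "open_settings"): "settings",
    ("home", "open_camera"): "camera",
    ("settings", "tap_wifi"): "wifi_page",
}


def toy_propose(state: Sketch) -> List[Action]:
    return [a for (s, a) in TRANSITIONS if s == state]


def toy_predict(state: Sketch, action: Action) -> Sketch:
    return TRANSITIONS.get((state, action), state)


def toy_score(state: Sketch, goal: str) -> float:
    return 1.0 if state == goal else 0.0


chosen = select_action("home", "wifi_page", toy_propose, toy_predict, toy_score)
print(chosen)
```

In this toy setup, reaching `wifi_page` requires two steps, so a two-step rollout is what lets the agent prefer `open_settings` over `open_camera` from the home screen. In a real system, `predict` would be the learned world model and `score` a learned or prompted judge.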
