GUI Agents Papers

MobileDreamer: Generative Sketch World Model for GUI Agent

Yilin Cao, Yufeng Zhong, Zhixiong Zeng, Liming Zheng, Jing Huang, Haibo Qiu, Peng Shi, Wenji Mao, Wan Guanglu

🏛 Institutions
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS; University of Chinese Academy of Sciences; Meituan
📅 Date
January 7, 2026
📑 Publisher
arXiv
💻 Env
Mobile
TLDR

MobileDreamer equips mobile GUI agents with a lightweight world model that predicts task-relevant textual sketches of future interface states instead of full screenshots. The agent then selects actions by rollout imagination over those predicted futures, improving AndroidWorld performance by 5.25% and reaching state-of-the-art results.
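
As a rough illustration of what selecting actions by rollout imagination over predicted textual sketches can look like, here is a minimal Python sketch. It is not the paper's implementation: `select_action`, `predict_sketch`, `propose_actions`, `score`, and the toy transition table are all hypothetical names invented for this example, and the exhaustive depth-limited search stands in for whatever rollout strategy the authors actually use.

```python
from typing import Callable, Sequence


def select_action(
    state: str,
    candidates: Sequence[str],
    predict_sketch: Callable[[str, str], str],      # hypothetical world model
    propose_actions: Callable[[str], Sequence[str]],  # hypothetical policy
    score: Callable[[str], float],                  # hypothetical task-progress scorer
    depth: int = 2,
) -> str:
    """Return the candidate whose imagined rollout reaches the best-scoring sketch."""
    if not candidates:
        raise ValueError("at least one candidate action is required")

    def rollout_value(sketch: str, remaining: int) -> float:
        # Best score reachable from `sketch` within `remaining` further imagined steps.
        best = score(sketch)
        if remaining == 0:
            return best
        for action in propose_actions(sketch):
            next_sketch = predict_sketch(sketch, action)
            best = max(best, rollout_value(next_sketch, remaining - 1))
        return best

    # Imagine one step for each candidate, then roll out depth - 1 further steps.
    return max(
        candidates,
        key=lambda a: rollout_value(predict_sketch(state, a), depth - 1),
    )


# Toy stand-ins (assumptions for illustration only): sketches are plain labels.
TRANSITIONS = {
    ("home", "open_settings"): "settings",
    ("home", "open_camera"): "camera",
    ("settings", "tap_wifi"): "wifi_page",
}


def toy_predict(sketch: str, action: str) -> str:
    # Unknown (sketch, action) pairs leave the imagined screen unchanged.
    return TRANSITIONS.get((sketch, action), sketch)


def toy_propose(sketch: str) -> list[str]:
    return [a for (s, a) in TRANSITIONS if s == sketch]


def toy_score(sketch: str) -> float:
    # The toy task is "reach the Wi-Fi page".
    return 1.0 if sketch == "wifi_page" else 0.0


chosen = select_action(
    "home", ["open_camera", "open_settings"], toy_predict, toy_propose, toy_score
)
print(chosen)
```

In this toy setup, neither first step scores on its own, so the lookahead is what separates the two candidates: only the settings branch can reach the goal sketch within two imagined steps.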
