MobileDreamer: Generative Sketch World Model for GUI Agent
Yilin Cao, Yufeng Zhong, Zhixiong Zeng, Liming Zheng, Jing Huang, Haibo Qiu, Peng Shi, Wenji Mao, Wan Guanglu

TL;DR
MobileDreamer introduces an efficient world model for GUI agents that enables better long-term decision making by forecasting future states through textual sketches, significantly improving task success rates.
Contribution
It presents a novel textual sketch world model and rollout imagination strategy that enhance GUI agent performance by enabling spatially aware, long-horizon planning.
Findings
Achieves state-of-the-art performance on Android World.
Improves task success rate by 5.25%.
Accurately forecasts key GUI elements with textual sketches.
Abstract
Mobile GUI agents have shown strong potential in real-world automation and practical applications. However, most existing agents remain reactive, making decisions mainly from the current screen, which limits their performance on long-horizon tasks. Building a world model from repeated interactions enables forecasting action outcomes and supports better decision making for mobile GUI agents. This is challenging because the model must predict post-action states with spatial awareness while remaining efficient enough for practical deployment. In this paper, we propose MobileDreamer, an efficient world-model-based lookahead framework that equips GUI agents with the future imagination provided by the world model. It consists of a textual sketch world model and rollout imagination for the GUI agent. The textual sketch world model forecasts post-action states through a learning process to transform…
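The abstract is cut off before it describes the sketch format or the rollout procedure, so neither can be reproduced here. As a rough illustration of what world-model-based lookahead means in general, the following Python sketch scores each candidate action by imagining a short rollout with a world model and keeping the action whose imagined end state scores best. Everything in it is an assumption made for illustration: the function `rollout_lookahead`, the string-valued "sketch" states, the toy transition table, the toy policy, and the scoring function are all hypothetical and are not taken from MobileDreamer.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical stand-ins: a "sketch" is just a string naming the screen,
# and an action is a string such as "tap settings".
Sketch = str
Action = str


def rollout_lookahead(
    state: Sketch,
    candidates: Sequence[Action],
    world_model: Callable[[Sketch, Action], Sketch],
    policy: Callable[[Sketch], Action],
    score: Callable[[Sketch], float],
    depth: int = 2,
) -> Tuple[Optional[Action], float]:
    """Return the candidate whose imagined rollout ends in the best state.

    Generic lookahead, not the paper's algorithm: the first step applies the
    candidate action, later steps follow `policy`, and only the final
    imagined state is scored.
    """
    best_action: Optional[Action] = None
    best_value = float("-inf")
    for action in candidates:
        imagined = world_model(state, action)      # imagined step 1
        for _ in range(depth - 1):                 # imagined steps 2..depth
            imagined = world_model(imagined, policy(imagined))
        value = score(imagined)
        if value > best_value:
            best_action, best_value = action, value
    return best_action, best_value


# Toy world model (assumption): a lookup table of screen transitions.
TRANSITIONS = {
    ("home", "tap settings"): "settings",
    ("home", "tap camera"): "camera",
    ("settings", "tap wifi"): "wifi",
}


def toy_world_model(state: Sketch, action: Action) -> Sketch:
    # Unknown (state, action) pairs leave the screen unchanged.
    return TRANSITIONS.get((state, action), state)


def toy_policy(state: Sketch) -> Action:
    # Always tries to open the Wi-Fi page, whatever the screen.
    return "tap wifi"


def toy_score(state: Sketch) -> float:
    # Goal: reach the Wi-Fi screen.
    return 1.0 if state == "wifi" else 0.0


choice = rollout_lookahead(
    "home", ["tap camera", "tap settings"],
    toy_world_model, toy_policy, toy_score, depth=2,
)
```

In this toy setup a purely reactive agent has no reason to prefer either first tap, since neither reaches the goal immediately; the two-step imagined rollout is what separates them, because only "tap settings" leads to a screen from which the Wi-Fi page is reachable.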
Taxonomy
Topics: Social Robot Interaction and HRI · Artificial Intelligence in Games · Multimodal Machine Learning Applications
