Dreamland: Controllable World Creation with Simulator and Generative Models
Sicheng Mo, Ziyang Leng, Leon Liu, Weizhen Wang, Honglin He, Bolei Zhou

TL;DR
Dreamland introduces a hybrid framework combining physics-based simulation and large-scale generative models to enable controllable, realistic world creation, improving scene editing and embodied AI training.
Contribution
The paper proposes a layered world abstraction that bridges physics-based simulators and pretrained generative models, improving both the controllability and the realism of generated worlds.
Findings
Image quality improved by 50.8% over existing baselines
Controllability improved by 17.9%
Effective for training embodied AI agents
Abstract
Large-scale video generative models can synthesize diverse and realistic visual content for dynamic world creation, but they often lack element-wise controllability, hindering their use in editing scenes and training embodied AI agents. We propose Dreamland, a hybrid world generation framework combining the granular control of a physics-based simulator and the photorealistic content output of large-scale pretrained generative models. In particular, we design a layered world abstraction that encodes both pixel-level and object-level semantics and geometry as an intermediate representation to bridge the simulator and the generative model. This approach enhances controllability, minimizes adaptation cost through early alignment with real-world distributions, and supports off-the-shelf use of existing and future pretrained generative models. We further construct a D3Sim dataset to…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Human Motion and Animation · Artificial Intelligence in Games
