Pre-Trained Video Generative Models as World Simulators
Haoran He, Yang Zhang, Liang Lin, Zhongwen Xu, Ling Pan

TL;DR
This paper introduces Dynamic World Simulation (DWS), a method to convert pre-trained video generative models into controllable world simulators that accurately model dynamic actions and transitions for applications in games and robotics.
Contribution
The paper presents a universal action-conditioned module and a motion-reinforced loss to enable pre-trained models to simulate dynamic worlds with controllable actions.
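This summary does not give the form of the motion-reinforced loss. As a rough illustration of the general idea only (putting more weight on regions that change between frames), a motion-weighted reconstruction error can be sketched as below. The function name, the `alpha` parameter, and the weighting rule `1 + alpha * |frame difference|` are assumptions made for illustration, not the paper's definition.

```python
def motion_weighted_mse(pred, target, alpha=1.0):
    """Hypothetical motion-weighted MSE (illustrative, not the paper's loss).

    pred, target: lists of frames; each frame is a flat list of floats.
    Each pixel's squared error at frame t is weighted by
    1 + alpha * |target[t] - target[t-1]|, so pixels that move count more.
    Frame 0 is skipped because it has no previous frame.
    """
    total, count = 0.0, 0
    for t in range(1, len(target)):
        for p, cur, prev in zip(pred[t], target[t], target[t - 1]):
            weight = 1.0 + alpha * abs(cur - prev)  # assumed weighting rule
            total += weight * (p - cur) ** 2
            count += 1
    return total / count if count else 0.0


# Two frames of two pixels; only the second pixel changes over time.
target = [[0.0, 0.0], [0.0, 1.0]]
pred = [[0.0, 0.0], [0.5, 0.5]]
loss_plain = motion_weighted_mse(pred, target, alpha=0.0)   # ordinary MSE
loss_motion = motion_weighted_mse(pred, target, alpha=1.0)  # moving pixel upweighted
```

With `alpha=0.0` the function reduces to a plain mean squared error over frames 1 onward, which makes the effect of the motion weighting easy to compare.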
Findings
DWS improves action controllability in video generation.
The approach enhances dynamic consistency across different model types.
Applications in reinforcement learning show improved sample efficiency.
Abstract
Video generative models pre-trained on large-scale internet datasets have achieved remarkable success, excelling at producing realistic synthetic videos. However, they often generate clips based on static prompts (e.g., text or images), limiting their ability to model interactive and dynamic scenarios. In this paper, we propose Dynamic World Simulation (DWS), a novel approach to transform pre-trained video generative models into controllable world simulators capable of executing specified action trajectories. To achieve precise alignment between conditioned actions and generated visual changes, we introduce a lightweight, universal action-conditioned module that seamlessly integrates into any existing model. Instead of focusing on complex visual details, we demonstrate that consistent dynamic transition modeling is the key to building powerful world simulators. Building upon this…
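The abstract does not describe the internals of the action-conditioned module. One common way to inject a conditioning signal into an existing network is feature-wise scale-and-shift modulation (FiLM-style), and the sketch below uses that technique as a stand-in. The function name, the weight matrices `w_scale` and `w_shift`, and the per-frame feature layout are all hypothetical.

```python
def condition_on_action(features, action, w_scale, w_shift):
    """FiLM-style modulation, used here as an assumed stand-in.

    features: list of floats (channel features for one frame).
    action:   list of floats (the action applied at that frame).
    w_scale, w_shift: matrices of shape [len(features)][len(action)].
    Returns features * (1 + scale) + shift, where scale and shift are
    linear functions of the action.
    """
    out = []
    for c, feat in enumerate(features):
        scale = sum(w * a for w, a in zip(w_scale[c], action))
        shift = sum(w * a for w, a in zip(w_shift[c], action))
        out.append(feat * (1.0 + scale) + shift)
    return out


features = [1.0, 2.0]
action = [1.0, 0.0]

# Zero-initialised weights leave the pre-trained features unchanged.
zeros = [[0.0, 0.0], [0.0, 0.0]]
unchanged = condition_on_action(features, action, zeros, zeros)

# Non-zero weights let the action rescale and shift individual channels.
w_scale = [[0.5, 0.0], [0.0, 0.0]]
w_shift = [[0.0, 0.0], [1.0, 0.0]]
modulated = condition_on_action(features, action, w_scale, w_shift)
```

The `1 + scale` form is a design choice in this sketch: with zero weights the module acts as the identity, so a pre-trained model's behaviour is preserved before any fine-tuning.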
Taxonomy
Topics: Artificial Intelligence in Games · Computational Physics and Python Applications · Human Motion and Animation
Methods: Diffusion
