Pre-Trained Video Generative Models as World Simulators

Haoran He; Yang Zhang; Liang Lin; Zhongwen Xu; Ling Pan

arXiv:2502.07825·cs.CV·February 13, 2025

Open Access

TL;DR

This paper introduces Dynamic World Simulation (DWS), a method to convert pre-trained video generative models into controllable world simulators that accurately model dynamic actions and transitions for applications in games and robotics.

Contribution

The paper presents a universal action-conditioned module and a motion-reinforced loss to enable pre-trained models to simulate dynamic worlds with controllable actions.
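The summary above names a motion-reinforced loss but does not give its form. The following is only an illustrative sketch of one way a loss can emphasize dynamic regions: it weights per-pixel squared error by the absolute change between consecutive target frames. The function name `motion_weighted_mse`, the parameter `alpha`, and the weighting rule itself are assumptions made for this example and should not be read as the paper's formulation.

```python
import numpy as np


def motion_weighted_mse(pred, target, alpha=1.0):
    """Mean squared error with extra weight on pixels that change over time.

    Hypothetical illustration only; not the loss defined in the paper.
    pred, target: arrays of identical shape (T, H, W) or (T, H, W, C).
    alpha: how strongly motion increases the weight (0 gives plain MSE).
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    # Motion proxy: absolute change between consecutive target frames.
    motion = np.abs(np.diff(target, axis=0))
    # The first frame has no predecessor, so it gets zero motion.
    motion = np.concatenate([np.zeros_like(target[:1]), motion], axis=0)

    # Static pixels keep weight 1; moving pixels are weighted more heavily.
    weights = 1.0 + alpha * motion
    return float(np.mean(weights * (pred - target) ** 2))


# Two 1x1 frames: the target pixel switches from 0 to 1, the prediction stays 0.
target = np.array([0.0, 1.0]).reshape(2, 1, 1)
pred = np.zeros((2, 1, 1))
print(motion_weighted_mse(pred, target, alpha=1.0))  # 1.0
print(motion_weighted_mse(pred, target, alpha=0.0))  # 0.5 (plain MSE)
```

In this toy case the error sits entirely on the frame where the pixel changes, so the weighted loss is twice the plain MSE. A prediction that ignores the transition is penalized more than one that misses static detail by the same amount.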

Findings

01 DWS improves action controllability in video generation.

02 The approach enhances dynamic consistency across different model types.

03 Applications in reinforcement learning show improved sample efficiency.

Abstract

Video generative models pre-trained on large-scale internet datasets have achieved remarkable success, excelling at producing realistic synthetic videos. However, they often generate clips based on static prompts (e.g., text or images), limiting their ability to model interactive and dynamic scenarios. In this paper, we propose Dynamic World Simulation (DWS), a novel approach to transform pre-trained video generative models into controllable world simulators capable of executing specified action trajectories. To achieve precise alignment between conditioned actions and generated visual changes, we introduce a lightweight, universal action-conditioned module that seamlessly integrates into any existing model. Instead of focusing on complex visual details, we demonstrate that consistent dynamic transition modeling is the key to building powerful world simulators. Building upon this…

Taxonomy

Topics: Artificial Intelligence in Games · Computational Physics and Python Applications · Human Motion and Animation

Methods: Diffusion