Physical Informed Driving World Model

Zhuoran Yang; Xi Guo; Chenjing Ding; Chiyu Wang; Wei Wu

arXiv:2412.08410·cs.CV·December 16, 2024

TL;DR

DrivePhysica is a physics-informed generative model for multi-view driving videos that enhances realism and consistency by integrating physical principles into the generation process, improving downstream perception tasks.

Contribution

The paper introduces DrivePhysica, a novel model that incorporates physical principles into driving video generation through three modules, advancing realism and spatial-temporal consistency.

Findings

01

Achieves state-of-the-art FID and FVD scores on the nuScenes dataset.

02

Improves downstream perception task performance.

03

Ensures realistic motion and occlusion handling in generated videos.
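
For readers unfamiliar with the metrics in finding 01: FID and FVD are both Fréchet distances between Gaussians fitted to feature vectors of real and generated samples, and lower is better. The following is a minimal sketch of that distance only, assuming the features have already been extracted by some pretrained network. The function names `gaussian_stats` and `frechet_distance` are illustrative, and this is not the paper's evaluation code.

```python
import numpy as np


def gaussian_stats(features):
    """Fit a Gaussian to a (num_samples, dim) feature array (illustrative helper)."""
    mu = features.mean(axis=0)
    sigma = np.cov(features, rowvar=False)
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Squared Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    d^2 = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^(1/2))
    """
    diff = mu1 - mu2
    # For positive semi-definite covariances, the trace of the matrix square
    # root of the product equals the sum of square roots of its eigenvalues.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    # Clip tiny negative values caused by floating-point error.
    tr_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in arrays; in practice these would be network features.
    real = rng.normal(size=(500, 8))
    fake = rng.normal(loc=0.5, size=(500, 8))
    print(frechet_distance(*gaussian_stats(real), *gaussian_stats(fake)))
```

In practice the score depends heavily on the feature extractor and sample count, so values are only comparable under the same evaluation protocol.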

Abstract

Autonomous driving requires robust perception models trained on high-quality, large-scale multi-view driving videos for tasks like 3D object detection, segmentation, and trajectory prediction. While world models provide a cost-effective solution for generating realistic driving videos, challenges remain in ensuring these videos adhere to fundamental physical principles, such as relative and absolute motion, spatial relationships like occlusion and spatial consistency, and temporal consistency. To address these, we propose DrivePhysica, an innovative model designed to generate realistic multi-view driving videos that accurately adhere to essential physical principles through three key advancements: (1) a Coordinate System Aligner module that integrates relative and absolute motion features to enhance motion interpretation, (2) an Instance Flow Guidance module that ensures precise temporal…

Taxonomy

Topics: Traffic Prediction and Management Techniques