Physical Informed Driving World Model
Zhuoran Yang, Xi Guo, Chenjing Ding, Chiyu Wang, Wei Wu

TL;DR
DrivePhysica is a physics-informed generative model for multi-view driving videos that enhances realism and consistency by integrating physical principles into the generation process, improving downstream perception tasks.
Contribution
The paper introduces DrivePhysica, a novel model that incorporates physical principles into driving video generation through three modules, advancing realism and spatial-temporal consistency.
Findings
Achieves state-of-the-art FID and FVD scores on the nuScenes dataset.
Improves downstream perception task performance.
Ensures realistic motion and occlusion handling in generated videos.
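For readers unfamiliar with the metrics named above, the standard definition of FID is reproduced here as background; it is not taken from the paper itself. The symbols are the usual ones: $\mu_r, \Sigma_r$ are the mean and covariance of Inception features extracted from real frames, and $\mu_g, \Sigma_g$ are the same statistics for generated frames.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

FVD applies the same Fréchet distance to features from a video network, so it also reflects temporal coherence. Lower is better for both scores.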
Abstract
Autonomous driving requires robust perception models trained on high-quality, large-scale multi-view driving videos for tasks like 3D object detection, segmentation, and trajectory prediction. While world models provide a cost-effective solution for generating realistic driving videos, challenges remain in ensuring these videos adhere to fundamental physical principles, such as relative and absolute motion, spatial relationships such as occlusion and spatial consistency, and temporal consistency. To address these, we propose DrivePhysica, an innovative model designed to generate realistic multi-view driving videos that accurately adhere to essential physical principles through three key advancements: (1) a Coordinate System Aligner module that integrates relative and absolute motion features to enhance motion interpretation, (2) an Instance Flow Guidance module that ensures precise temporal…
Taxonomy
Topics: Traffic Prediction and Management Techniques
