Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models
Xuanchi Ren, Yifan Lu, Tianshi Cao, Ruiyuan Gao, Shengyu Huang, Amirmojtaba Sabour, Tianchang Shen, Tobias Pfaff, Jay Zhangjie Wu, Runjian Chen, Seung Wook Kim, Jun Gao, Laura Leal-Taixe, Mike Chen, Sanja Fidler, Huan Ling

TL;DR
Cosmos-Drive-Dreams introduces a synthetic data generation pipeline powered by specialized world foundation models to create diverse, high-fidelity driving scenarios, improving autonomous vehicle training and testing by addressing data scarcity and edge cases.
Contribution
The paper presents a novel scalable synthetic data generation pipeline using NVIDIA's Cosmos world foundation models tailored for the driving domain, enabling controllable, multi-view, and spatiotemporally consistent driving video synthesis.
Findings
Generated data improves model generalization in downstream tasks.
Synthetic data helps mitigate long-tail distribution issues.
Pipeline and models are open-sourced for community use.
Abstract
Collecting and annotating real-world data for safety-critical physical AI systems, such as autonomous vehicles (AVs), is time-consuming and costly. It is especially challenging to capture rare edge cases, which play a critical role in training and testing an AV system. To address this challenge, we introduce Cosmos-Drive-Dreams, a synthetic data generation (SDG) pipeline that aims to generate challenging scenarios to facilitate downstream tasks such as perception and driving policy training. Powering this pipeline is Cosmos-Drive, a suite of models specialized from the NVIDIA Cosmos world foundation model for the driving domain and capable of controllable, high-fidelity, multi-view, and spatiotemporally consistent driving video generation. We showcase the utility of these models by applying Cosmos-Drive-Dreams to scale the quantity and diversity of driving datasets with…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications
