GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model
Yongjie Fu, Yunlong Li, and Xuan Di

TL;DR
GenDDS leverages advanced diffusion models and descriptive prompts to generate diverse, realistic driving videos, enhancing training datasets for autonomous vehicles and addressing the scarcity of rare scenario data.
Contribution
This work introduces a novel pipeline combining Stable Diffusion XL, ControlNet, and Hotshot-XL for high-quality, diverse driving scenario video generation from prompts.
Findings
Generated videos closely mimic real-world driving complexity.
The approach produces diverse scenarios including rare traffic conditions.
The generated videos are of high enough quality to be suitable for autonomous driving training.
Abstract
Autonomous driving training requires diverse datasets covering a range of traffic conditions, weather scenarios, and road types. Traditional data augmentation methods often struggle to generate datasets that represent rare occurrences. To address this challenge, we propose GenDDS, a novel approach for generating driving scenarios by leveraging the capabilities of Stable Diffusion XL (SDXL), an advanced latent diffusion model. Our methodology uses descriptive prompts to guide the synthesis process, with the aim of producing realistic and diverse driving scenarios. Combining SDXL with recent computer vision techniques such as ControlNet and Hotshot-XL, we have built a complete pipeline for video generation. We employ the KITTI dataset, which includes real-world driving videos, to train the model. Through a series of experiments, we…
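The abstract says descriptive prompts guide the synthesis and that the target data should span traffic conditions, weather, and road types. As a minimal sketch of how such prompts might be enumerated (the vocabularies, the prompt template, and the `build_prompts` helper are all hypothetical and not taken from the paper), one could combine the three attributes like this:

```python
from itertools import product

# Hypothetical attribute vocabularies for illustration only;
# the paper's actual prompt wording is not given in this summary.
WEATHER = ["clear", "rainy", "foggy", "snowy"]
ROAD_TYPES = ["urban street", "highway", "rural road"]
TRAFFIC = ["light traffic", "heavy traffic", "pedestrian crossing ahead"]


def build_prompts(weather=WEATHER, road_types=ROAD_TYPES, traffic=TRAFFIC):
    """Return one text prompt per (weather, road type, traffic) combination."""
    return [
        f"dashcam video, {road}, {w} weather, {t}"
        for w, road, t in product(weather, road_types, traffic)
    ]


prompts = build_prompts()
print(len(prompts))
print(prompts[0])
```

Each resulting string would then be passed as the text condition to whatever prompt-to-video model is in use. Enumerating the full cross product is one simple way to cover rare combinations such as snow on a rural road, though a real setup might sample or weight combinations instead.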
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Advanced Vision and Imaging
Methods: Diffusion
