GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model

Yongjie Fu; Yunlong Li; and Xuan Di

arXiv:2408.15868·cs.CV·August 29, 2024

TL;DR

GenDDS leverages advanced diffusion models and descriptive prompts to generate diverse, realistic driving videos, enhancing training datasets for autonomous vehicles and addressing the scarcity of rare scenario data.

Contribution

This work introduces a novel pipeline combining Stable Diffusion XL, ControlNet, and Hotshot-XL for high-quality, diverse driving scenario video generation from prompts.

Findings

01. Generated videos closely mimic real-world driving complexity.

02. The approach produces diverse scenarios, including rare traffic conditions.

03. The generated videos are of high quality and suitable for autonomous driving training.

Abstract

Autonomous driving training requires a diverse range of datasets encompassing various traffic conditions, weather scenarios, and road types. Traditional data augmentation methods often struggle to generate datasets that represent rare occurrences. To address this challenge, we propose GenDDS, a novel approach for driving scenario generation that leverages the capabilities of Stable Diffusion XL (SDXL), an advanced latent diffusion model. Our methodology involves the use of descriptive prompts to guide the synthesis process, aimed at producing realistic and diverse driving scenarios. With the power of the latest computer vision techniques, such as ControlNet and Hotshot-XL, we have built a complete pipeline for video generation together with SDXL. We employ the KITTI dataset, which includes real-world driving videos, to train the model. Through a series of experiments, we…
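The abstract says descriptive prompts guide the synthesis but does not give a prompt format. As a rough illustration only, the sketch below composes prompts from the three axes the abstract names (traffic conditions, weather, road types). The attribute values, the template string, and the `build_prompts` helper are all assumptions made for this example and are not taken from the paper.

```python
from itertools import product

# Hypothetical attribute vocabularies. The abstract names these three axes
# but does not list concrete values, so these are illustrative placeholders.
TRAFFIC = ["light traffic", "heavy traffic"]
WEATHER = ["clear sky", "heavy rain", "fog"]
ROADS = ["urban street", "highway"]


def build_prompts(traffic, weather, roads):
    """Return one descriptive prompt per attribute combination.

    The template is an assumption for illustration, not the paper's format.
    """
    return [
        f"driving video, {road}, {w}, {t}"
        for t, w, road in product(traffic, weather, roads)
    ]


prompts = build_prompts(TRAFFIC, WEATHER, ROADS)
for p in prompts[:3]:
    print(p)
```

Enumerating the full cross product is one simple way to cover uncommon combinations (for example, fog with heavy traffic on a highway) that are sparse in recorded data. Each resulting string would then be passed to whatever prompt-to-video pipeline is in use.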

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Advanced Vision and Imaging

Methods: Diffusion