Panacea: Panoramic and Controllable Video Generation for Autonomous Driving

Yuqing Wen; Yucheng Zhao; Yingfei Liu; Fan Jia; Yanhui Wang; Chong Luo; Chi Zhang; Tiancai Wang; Xiaoyan Sun; Xiangyu Zhang

arXiv:2311.16813·cs.CV·November 29, 2023·1 cites

TL;DR

Panacea is a novel method for generating high-quality, panoramic, and controllable driving videos that enhance autonomous vehicle training datasets by ensuring coherence and alignment with annotations.

Contribution

It introduces a new approach combining 4D attention, a two-stage pipeline, and ControlNet for controllable, coherent panoramic video generation in driving scenarios.

Findings

01. Effective generation of diverse, annotated driving videos.

02. Maintains temporal and cross-view consistency.

03. Improves autonomous driving perception models.

Abstract

The field of autonomous driving increasingly demands high-quality annotated training data. In this paper, we propose Panacea, an innovative approach to generate panoramic and controllable videos in driving scenarios, capable of yielding an unlimited number of diverse, annotated samples pivotal for autonomous driving advancements. Panacea addresses two critical challenges: 'Consistency' and 'Controllability.' Consistency ensures temporal and cross-view coherence, while Controllability ensures the alignment of generated content with corresponding annotations. Our approach integrates a novel 4D attention and a two-stage generation pipeline to maintain coherence, supplemented by the ControlNet framework for meticulous control by Bird's-Eye-View (BEV) layouts. Extensive qualitative and quantitative evaluations of Panacea on the nuScenes dataset prove its effectiveness in generating…
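
The abstract does not spell out how the 4D attention is built. As a rough illustration only, the sketch below assumes one plausible reading: multi-view video features of shape (frames, views, height, width, channels) are processed by three attention passes in sequence, one within each view's image, one across views, and one across frames. The function names, the residual connections, and the use of identity query/key/value projections are all assumptions made for brevity, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # Single-head self-attention over the second-to-last axis.
    # tokens: (..., N, C). Identity Q/K/V projections (an assumption;
    # a real model would use learned projections).
    scale = 1.0 / np.sqrt(tokens.shape[-1])
    scores = tokens @ np.swapaxes(tokens, -1, -2) * scale
    return softmax(scores, axis=-1) @ tokens

def attend_along(x, axis):
    # Attend among the elements along `axis`; every other axis except
    # the channel axis is treated as a batch dimension.
    moved = np.moveaxis(x, axis, -2)      # (..., N, C)
    return np.moveaxis(self_attention(moved), -2, axis)

def factorized_4d_attention(x):
    # Hypothetical factorization of attention over a (T, V, H, W, C) tensor.
    T, V, H, W, C = x.shape
    # 1) Intra-view: pixels attend to pixels of the same frame and view.
    spatial = self_attention(x.reshape(T, V, H * W, C)).reshape(x.shape)
    x = x + spatial
    # 2) Cross-view: the same pixel location attends across camera views.
    x = x + attend_along(x, axis=1)
    # 3) Cross-frame: the same pixel location attends across time.
    x = x + attend_along(x, axis=0)
    return x

# Example: 2 frames, 6 camera views (as in nuScenes), 3x4 feature map, 8 channels.
features = np.random.default_rng(0).normal(size=(2, 6, 3, 4, 8))
out = factorized_4d_attention(features)
print(out.shape)
```

Splitting the attention this way keeps each pass small: no single pass attends over all frames, views, and pixels at once, which is the usual motivation for factorized designs of this kind.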

Code & Models

Repositories

wenyuqing/panacea
pytorch

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging · Multimodal Machine Learning Applications