DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception
Jiayu Zou, Zheng Zhu, Yun Ye, Xingang Wang

TL;DR
DiffBEV is a diffusion-model-based framework for bird's eye view (BEV) perception in autonomous driving. It uses a conditional diffusion model to denoise BEV features and refine their semantic content.
Contribution
To the authors' knowledge, this is the first work to apply diffusion models to BEV perception. It designs three types of conditions to guide the diffusion model, plus a cross-attention module to enhance BEV feature generation.
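As a rough illustration of what a cross-attention module computes, the sketch below implements plain scaled dot-product attention between two sets of feature vectors. It is a generic sketch, not DiffBEV's module: the function names are hypothetical, there are no learned projection matrices, and which feature map supplies the queries versus the keys and values is an assumption made here for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    queries: list of d-dimensional vectors (assumed here: one feature map)
    keys, values: lists of vectors from the other feature map
    Returns one attended vector per query.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return out


# Toy usage: identical keys give uniform weights, so the output is the mean value.
attended = cross_attention(
    queries=[[1.0, 0.0]],
    keys=[[0.0, 0.0], [0.0, 0.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

In a real model the queries, keys, and values would first pass through learned linear layers, and the attended output would typically be fused back into the BEV feature.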
Findings
Achieves 25.9% mIoU on nuScenes, outperforming previous methods by 6.2%.
Demonstrates superior performance in BEV semantic segmentation.
Shows effectiveness in 3D object detection tasks.
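For readers unfamiliar with the mIoU metric quoted above, here is a minimal sketch of the generic definition: per-class intersection over union, averaged over classes. The function name and the toy labels are illustrative assumptions, and the benchmark's actual evaluation protocol (for example, per-class binary masks or visibility masking) may differ from this single-label version.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union for flat lists of class labels.

    Classes absent from both pred and target are skipped so they do not
    distort the average.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes over six BEV cells.
# Per-class IoUs are 1/3, 2/3 and 1/2.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```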
Abstract
BEV perception is of great importance in the field of autonomous driving, serving as the cornerstone of planning, controlling, and motion prediction. The quality of the BEV feature highly affects the performance of BEV perception. However, taking the noises in camera parameters and LiDAR scans into consideration, we usually obtain BEV representation with harmful noises. Diffusion models naturally have the ability to denoise noisy samples to the ideal data, which motivates us to utilize the diffusion model to get a better BEV representation. In this work, we propose an end-to-end framework, named DiffBEV, to exploit the potential of diffusion model to generate a more comprehensive BEV representation. To the best of our knowledge, we are the first to apply diffusion model to BEV perception. In practice, we design three types of conditions to guide the training of the diffusion model which…
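The abstract's premise is that diffusion models learn to map noisy samples back to clean data. A minimal sketch of the standard DDPM forward (noising) process that such a model learns to invert is shown below, using q(x_t | x_0) = N(sqrt(ᾱ_t) x_0, (1 − ᾱ_t) I). The linear schedule, its endpoints, the step count, and all function names are assumptions taken from common DDPM practice, not from DiffBEV's configuration, and the conditioning described in the abstract is not modeled here.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed; a common DDPM default). Needs num_steps > 1."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + step * t for t in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng):
    """Draw x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I) for a flat feature vector."""
    signal = math.sqrt(abar[t])
    noise = math.sqrt(1.0 - abar[t])
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]


rng = random.Random(0)
abar = alpha_bars(linear_betas(1000))
feature = [0.5, -1.0, 2.0]                  # stand-in for a flattened BEV feature
slightly_noisy = q_sample(feature, 0, abar, rng)
mostly_noise = q_sample(feature, 999, abar, rng)
```

At small t the sample stays close to the original feature, while at the final step almost none of the signal remains. A denoising network is trained to reverse this corruption step by step.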
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
Methods: Concatenated Skip Connection · Diffusion · Softmax
