Diffusion-Based Particle-DETR for BEV Perception
Asen Nachkov, Martin Danelljan, Danda Pani Paudel, Luc Van Gool

TL;DR
This paper introduces a diffusion-based DETR model for BEV perception in autonomous vehicles, improving small object detection and uncertainty modeling by combining diffusion paradigms with 3D detectors.
Contribution
It proposes a novel diffusion-based DETR model with object query interpolation, enhancing small object detection and uncertainty modeling in BEV perception.
Findings
Achieves equal or better performance than state-of-the-art deterministic methods on the nuScenes dataset.
Effectively detects small objects in large BEV coverage.
Demonstrates the effectiveness of diffusion-based uncertainty modeling.
Abstract
The Bird's-Eye View (BEV) is one of the most widely used scene representations for visual perception in Autonomous Vehicles (AVs) because it is well suited to downstream tasks. For the enhanced safety of AVs, modeling perception uncertainty in BEV is crucial. Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but fail to effectively detect small objects in the large coverage of the BEV. This performance degradation can be attributed primarily to the specific network architectures and the matching strategy used during training. Here, we address this problem by combining the diffusion paradigm with current state-of-the-art 3D object detectors in BEV. We analyze the unique challenges of this approach, which do not exist with deterministic detectors, and present a simple technique based on object query interpolation that allows…
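For readers unfamiliar with the "diffusion paradigm" the abstract refers to, the following is a minimal sketch of the generic DDPM-style forward (noising) process applied to object centers in the BEV plane. It is not the paper's implementation: the cosine schedule, the normalization of centers to [-1, 1], and the names `cosine_alpha_bar` and `noise_bev_centers` are assumptions made for illustration. The object query interpolation technique is not shown, because the abstract is truncated before describing it.

```python
import numpy as np


def cosine_alpha_bar(num_steps: int, s: float = 0.008) -> np.ndarray:
    """Cumulative signal rate alpha_bar_t for t = 0..num_steps (cosine schedule)."""
    t = np.linspace(0.0, 1.0, num_steps + 1)
    f = np.cos((t + s) / (1.0 + s) * np.pi / 2.0) ** 2
    return f / f[0]  # normalized so that alpha_bar[0] == 1


def noise_bev_centers(centers, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(a_t) * x_0, (1 - a_t) * I)."""
    eps = rng.standard_normal(centers.shape)  # Gaussian noise, same shape as centers
    a_t = alpha_bar[t]
    return np.sqrt(a_t) * centers + np.sqrt(1.0 - a_t) * eps


rng = np.random.default_rng(0)
alpha_bar = cosine_alpha_bar(num_steps=1000)

# Hypothetical ground-truth (x, y) box centers, normalized to [-1, 1] over the BEV grid.
gt_centers = np.array([[0.10, -0.30], [0.55, 0.20], [-0.70, 0.65]])

slightly_noised = noise_bev_centers(gt_centers, 10, alpha_bar, rng)    # close to gt_centers
heavily_noised = noise_bev_centers(gt_centers, 1000, alpha_bar, rng)   # nearly pure noise
```

In a diffusion-based detector, a network is typically trained to recover the clean boxes from such noised samples, and running the reverse process from several random starting points yields a spread of predictions that can serve as an uncertainty estimate.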
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Adam · Residual Connection · Layer Normalization · Dropout
