DifFUSER: Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation

Duy-Tho Le; Hengcan Shi; Jianfei Cai; Hamid Rezatofighi

arXiv:2404.04629 · cs.CV · September 25, 2024 · 1 citation

TL;DR

DifFUSER introduces a diffusion model-based multi-sensor fusion approach for 3D object detection and BEV segmentation, enhancing robustness and performance, especially under sensor failure conditions.

Contribution

The paper presents DifFUSER, a diffusion-based framework whose hierarchical architecture and training paradigm improve the robustness and accuracy of multi-sensor fusion.

Findings

01

Achieves 70.04% mIoU in BEV map segmentation on nuScenes.

02

Outperforms existing methods in sensor failure scenarios.

03

Competitive with transformer-based fusion techniques.
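
For reference, the mIoU figure in the first finding is a mean of per-class intersection-over-union scores. The sketch below shows that general computation on flat binary masks. The function name `mean_iou`, the dictionary layout, and the toy class names are assumptions made for illustration; the paper's exact evaluation protocol (for example, any score thresholds) is not described on this page.

```python
def mean_iou(pred_masks, gt_masks):
    """Mean IoU over classes.

    pred_masks, gt_masks: dict mapping a class name to a flat list of 0/1
    values, one per BEV cell (hypothetical layout, for illustration only).
    Classes whose union is empty are skipped.
    """
    ious = []
    for cls, gt in gt_masks.items():
        pred = pred_masks[cls]
        intersection = sum(1 for p, g in zip(pred, gt) if p and g)
        union = sum(1 for p, g in zip(pred, gt) if p or g)
        if union == 0:
            continue  # class absent in both prediction and ground truth
        ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two made-up classes on a 4-cell grid.
pred = {"road": [1, 1, 0, 0], "walkway": [0, 1, 1, 0]}
gt = {"road": [1, 0, 0, 0], "walkway": [0, 1, 1, 0]}
score = mean_iou(pred, gt)  # road IoU = 1/2, walkway IoU = 2/2
```

Here the per-class scores are 0.5 and 1.0, so the mean is 0.75.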

Abstract

Diffusion models have recently gained prominence as powerful deep generative models, demonstrating unmatched performance across various domains. However, their potential in multi-sensor fusion remains largely unexplored. In this work, we introduce DifFUSER, a novel approach that leverages diffusion models for multi-modal fusion in 3D object detection and BEV map segmentation. Benefiting from the inherent denoising property of diffusion, DifFUSER is able to refine or even synthesize sensor features in case of sensor malfunction, thereby improving the quality of the fused output. In terms of architecture, our DifFUSER blocks are chained together in a hierarchical BiFPN fashion, termed cMini-BiFPN, offering an alternative architecture for latent diffusion. We further introduce a Gated Self-conditioned Modulated (GSM) latent diffusion module together with a Progressive Sensor Dropout…
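
The abstract is cut off before it describes Progressive Sensor Dropout, so the following is only a generic illustration of sensor dropout as a training-time augmentation, not the paper's procedure. It assumes two modalities represented as flat feature lists and a hypothetical linear schedule that raises the dropout probability over training; the names `progressive_drop_prob`, `drop_sensor`, and `max_prob` are invented for this sketch.

```python
import random


def progressive_drop_prob(step, total_steps, max_prob=0.5):
    """Hypothetical linear ramp from 0 to max_prob over total_steps."""
    return max_prob * min(1.0, step / total_steps)


def drop_sensor(camera_feat, lidar_feat, drop_prob, rng):
    """With probability drop_prob, zero out one randomly chosen modality.

    Zeroing a whole modality simulates a sensor failure, so a fusion model
    trained this way must learn to cope with a missing input.
    """
    if rng.random() < drop_prob:
        if rng.random() < 0.5:
            camera_feat = [0.0] * len(camera_feat)
        else:
            lidar_feat = [0.0] * len(lidar_feat)
    return camera_feat, lidar_feat


rng = random.Random(0)
camera = [0.2, 0.7, 0.1]
lidar = [0.9, 0.4, 0.3]

# Early in training nothing is dropped; later, dropout becomes more likely.
p_start = progressive_drop_prob(step=0, total_steps=100)
p_end = progressive_drop_prob(step=100, total_steps=100)
cam_out, lidar_out = drop_sensor(camera, lidar, drop_prob=1.0, rng=rng)
```

With `drop_prob=1.0` exactly one of the two returned feature lists is all zeros, and the original lists are left unmodified because the function rebinds rather than mutates them.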

Taxonomy

Topics: Industrial Vision Systems and Defect Detection

Methods: Sensor Dropout or SensD · Pointwise Convolution · Batch Normalization · Depthwise Convolution · Depthwise Separable Convolution · Dropout · BiFPN · Diffusion