FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection
Zheng Jiang, Jinqing Zhang, Yanan Zhang, Qingjie Liu, Zhenghui Hu, Baohui Wang, Yunhong Wang

TL;DR
FSD-BEV introduces a novel foreground self-distillation approach combined with point cloud intensification and multi-scale feature enhancement to improve multi-view 3D object detection performance without relying on pre-trained teacher models.
Contribution
The paper proposes a self-distillation scheme that avoids distribution discrepancies, along with point cloud intensification and multi-scale foreground enhancement, advancing multi-view 3D detection methods.
Findings
Achieves state-of-the-art results on the nuScenes dataset.
Effectively improves detection accuracy without pre-trained teachers.
Demonstrates robustness across various detection scenarios.
Abstract
Although multi-view 3D object detection based on the Bird's-Eye-View (BEV) paradigm has garnered widespread attention as an economical and deployment-friendly perception solution for autonomous driving, there is still a performance gap compared to LiDAR-based methods. In recent years, several cross-modal distillation methods have been proposed to transfer beneficial information from teacher models to student models, with the aim of enhancing performance. However, these methods face challenges due to discrepancies in feature distribution originating from different data modalities and network structures, making knowledge transfer exceptionally challenging. In this paper, we propose a Foreground Self-Distillation (FSD) scheme that effectively avoids the issue of distribution discrepancies, maintaining remarkable distillation effects without the need for pre-trained teacher models or…
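The abstract above is cut off before it describes the distillation loss, so the following is only a generic sketch of what restricting feature distillation to foreground regions can look like. It is not the paper's formulation. The function name `foreground_distill_loss`, the (C, H, W) BEV feature layout, the binary foreground mask, and the choice of mean squared error are all assumptions made for illustration. How FSD-BEV actually builds its teacher and student branches is not stated in this summary.

```python
import numpy as np


def foreground_distill_loss(student, teacher, fg_mask, eps=1e-6):
    """Mean squared error between two BEV feature maps over foreground cells only.

    Hypothetical helper for illustration, not the FSD-BEV loss.
    student, teacher: arrays of shape (C, H, W)
    fg_mask: array of shape (H, W), 1 for foreground cells, 0 for background
    """
    student = np.asarray(student, dtype=np.float64)
    teacher = np.asarray(teacher, dtype=np.float64)
    mask = np.asarray(fg_mask, dtype=np.float64)
    if student.shape != teacher.shape or mask.shape != student.shape[1:]:
        raise ValueError("student/teacher must share (C, H, W); mask must be (H, W)")

    sq_err = (student - teacher) ** 2        # per-element squared error, (C, H, W)
    masked = sq_err * mask[None, :, :]       # zero out background cells
    denom = mask.sum() * student.shape[0]    # number of foreground feature elements
    return float(masked.sum() / (denom + eps))  # eps guards against an empty mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(4, 8, 8))
    student = teacher + 0.1 * rng.normal(size=(4, 8, 8))
    mask = np.zeros((8, 8))
    mask[2:5, 3:6] = 1.0                     # assumed foreground box footprint
    print(foreground_distill_loss(student, teacher, mask))
```

Normalizing by the number of foreground elements rather than by the full map keeps the loss scale independent of how much of the scene is foreground, which is one common design choice. Other normalizations are equally plausible.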
Taxonomy
Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
Methods: Softmax · Attention Is All You Need
