FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection

Zheng Jiang; Jinqing Zhang; Yanan Zhang; Qingjie Liu; Zhenghui Hu; Baohui Wang; Yunhong Wang

arXiv:2407.10135·cs.CV·July 16, 2024

TL;DR

FSD-BEV introduces a novel foreground self-distillation approach combined with point cloud intensification and multi-scale feature enhancement to improve multi-view 3D object detection performance without relying on pre-trained teacher models.

Contribution

The paper proposes a self-distillation scheme that avoids distribution discrepancies, along with point cloud intensification and multi-scale foreground enhancement, advancing multi-view 3D detection methods.

Findings

1. Achieves state-of-the-art results on the nuScenes dataset.

2. Improves detection accuracy without pre-trained teacher models.

3. Demonstrates robustness across various detection scenarios.

Abstract

Although multi-view 3D object detection based on the Bird's-Eye-View (BEV) paradigm has garnered widespread attention as an economical and deployment-friendly perception solution for autonomous driving, there is still a performance gap compared to LiDAR-based methods. In recent years, several cross-modal distillation methods have been proposed to transfer beneficial information from teacher models to student models, with the aim of enhancing performance. However, these methods face challenges due to discrepancies in feature distribution originating from different data modalities and network structures, making knowledge transfer exceptionally challenging. In this paper, we propose a Foreground Self-Distillation (FSD) scheme that effectively avoids the issue of distribution discrepancies, maintaining remarkable distillation effects without the need for pre-trained teacher models or…

Code & Models

Repositories

cocoboom/fsd-bev (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques

Methods: Softmax · Attention Is All You Need