TL;DR
This paper introduces a Cross-Modality Knowledge Distillation network that effectively transfers LiDAR knowledge to monocular 3D detection, achieving state-of-the-art results through end-to-end training and semi-supervised learning.
Contribution
It proposes a novel end-to-end framework for knowledge transfer from LiDAR to images and extends it with semi-supervised learning to improve monocular 3D detection.
Findings
Ranks 1st among published monocular 3D detectors on the KITTI test set and the Waymo validation set at the time of submission
Significant performance improvements over previous methods
Effective knowledge transfer from LiDAR to monocular detection
Abstract
Leveraging LiDAR-based detectors or real LiDAR point data to guide monocular 3D detection has brought significant improvement, e.g., Pseudo-LiDAR methods. However, the existing methods usually apply non-end-to-end training strategies and insufficiently leverage the LiDAR information, where the rich potential of the LiDAR data has not been well exploited. In this paper, we propose the Cross-Modality Knowledge Distillation (CMKD) network for monocular 3D detection to efficiently and directly transfer the knowledge from LiDAR modality to image modality on both features and responses. Moreover, we further extend CMKD as a semi-supervised training framework by distilling knowledge from large-scale unlabeled data and significantly boost the performance. Until submission, CMKD ranks 1st among the monocular 3D detectors with publications on both KITTI test set and Waymo val set with…
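To make "on both features and responses" concrete, here is a minimal sketch of a two-term distillation loss in plain Python. It is an illustration only, not the authors' implementation: the function names, the toy 2D grids, the choice of mean squared error for the feature term and binary cross-entropy for the response term, and the weights w_feat and w_resp are all assumptions introduced here. Because neither term uses ground-truth boxes, a loss of this shape can in principle be computed on unlabeled samples, which is one plausible reading of the semi-supervised extension.

```python
import math
from typing import Iterator, List, Tuple

# Toy stand-in for a feature or score map; real maps are multi-channel tensors.
Grid = List[List[float]]


def _pairs(student: Grid, teacher: Grid) -> Iterator[Tuple[float, float]]:
    """Yield (student, teacher) values cell by cell, checking that shapes match."""
    if len(student) != len(teacher):
        raise ValueError("student and teacher maps must have the same shape")
    for s_row, t_row in zip(student, teacher):
        if len(s_row) != len(t_row):
            raise ValueError("student and teacher maps must have the same shape")
        yield from zip(s_row, t_row)


def feature_distillation_loss(student: Grid, teacher: Grid) -> float:
    """Mean squared error between image-branch and LiDAR-branch features (assumed loss)."""
    squared = [(s - t) ** 2 for s, t in _pairs(student, teacher)]
    return sum(squared) / len(squared)


def response_distillation_loss(student: Grid, teacher: Grid) -> float:
    """Binary cross-entropy of student scores against teacher soft labels (assumed loss)."""
    eps = 1e-7  # keeps log() finite when a score is exactly 0 or 1
    terms = []
    for s, t in _pairs(student, teacher):
        p = min(max(s, eps), 1.0 - eps)
        terms.append(-(t * math.log(p) + (1.0 - t) * math.log(1.0 - p)))
    return sum(terms) / len(terms)


def distillation_loss(
    student_feat: Grid,
    teacher_feat: Grid,
    student_resp: Grid,
    teacher_resp: Grid,
    w_feat: float = 1.0,  # hypothetical weight, not taken from the paper
    w_resp: float = 1.0,  # hypothetical weight, not taken from the paper
) -> float:
    """Weighted sum of the feature term and the response term."""
    return (
        w_feat * feature_distillation_loss(student_feat, teacher_feat)
        + w_resp * response_distillation_loss(student_resp, teacher_resp)
    )


# Usage: the teacher values would come from a frozen LiDAR detector,
# the student values from the monocular network being trained.
loss = distillation_loss(
    student_feat=[[1.0, 2.0]],
    teacher_feat=[[1.0, 4.0]],
    student_resp=[[0.5]],
    teacher_resp=[[1.0]],
)
print(loss)  # feature term 2.0 plus response term ln(2)
```

In a real training loop these terms would be computed on tensors with an autodiff framework so that gradients reach only the image branch; the pure-Python version above just shows how the two signals combine.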
Taxonomy
Methods: Knowledge Distillation
