MonoDistill: Learning Spatial Features for Monocular 3D Object Detection

Zhiyu Chong; Xinzhu Ma; Hong Zhang; Yuxin Yue; Haojie Li; Zhihui Wang; Wanli Ouyang

arXiv:2201.10830·cs.CV·January 27, 2022·59 cites

TL;DR

MonoDistill trains a LiDAR-based teacher network with the same architecture as a monocular 3D detector, then distills the teacher's spatial knowledge into the monocular model. This improves detection accuracy without adding any inference cost and achieves top performance on the KITTI benchmark.

Contribution

The paper presents a simple scheme to incorporate LiDAR-derived spatial information into monocular detectors via knowledge distillation, enhancing 3D detection accuracy.

Findings

01. Significant performance boost on the KITTI benchmark.

02. Effective knowledge transfer from LiDAR signals to monocular models.

03. Validated through extensive ablation studies.

Abstract

3D object detection is a fundamental and challenging task for 3D scene understanding, and the monocular-based methods can serve as an economical alternative to the stereo-based or LiDAR-based methods. However, accurately detecting objects in the 3D space from a single image is extremely difficult due to the lack of spatial cues. To mitigate this issue, we propose a simple and effective scheme to introduce the spatial information from LiDAR signals to the monocular 3D detectors, without introducing any extra cost in the inference phase. In particular, we first project the LiDAR signals into the image plane and align them with the RGB images. After that, we use the resulting data to train a 3D detector (LiDAR Net) with the same architecture as the baseline model. Finally, this LiDAR Net can serve as the teacher to transfer the learned knowledge to the baseline model. Experimental results…
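The first step in the abstract, projecting LiDAR signals into the image plane, can be illustrated with a minimal sketch. It assumes a pinhole camera with intrinsic matrix `K` and points already transformed into the camera frame; the function names `project_lidar_to_depth_map` and `feature_distillation_loss` are hypothetical and do not come from the paper or its repository. The second function is only a generic mean-squared feature-imitation loss, shown to make the teacher-to-student idea concrete; the abstract does not specify the distillation losses the authors actually use.

```python
import numpy as np


def project_lidar_to_depth_map(points_cam, K, height, width):
    """Project 3D points (camera frame, z forward) into a sparse depth map.

    points_cam: (N, 3) array of x, y, z coordinates.
    K: (3, 3) pinhole intrinsic matrix (an assumption of this sketch).
    Returns a (height, width) array; 0 marks pixels with no LiDAR return.
    """
    points_cam = np.asarray(points_cam, dtype=np.float64)
    K = np.asarray(K, dtype=np.float64)

    # Keep only points in front of the camera.
    pts = points_cam[points_cam[:, 2] > 0]

    # Perspective projection: [u*z, v*z, z]^T = K @ [x, y, z]^T.
    uvz = pts @ K.T
    u = np.floor(uvz[:, 0] / uvz[:, 2]).astype(int)
    v = np.floor(uvz[:, 1] / uvz[:, 2]).astype(int)
    z = pts[:, 2]

    # Discard points that fall outside the image bounds.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]

    depth = np.zeros((height, width), dtype=np.float64)
    # Write far points first so the nearest point wins on shared pixels.
    order = np.argsort(-z)
    depth[v[order], u[order]] = z[order]
    return depth


def feature_distillation_loss(student_feat, teacher_feat):
    """Generic mean-squared feature-imitation loss (illustrative only)."""
    student_feat = np.asarray(student_feat, dtype=np.float64)
    teacher_feat = np.asarray(teacher_feat, dtype=np.float64)
    return float(np.mean((student_feat - teacher_feat) ** 2))
```

The resulting depth map is pixel-aligned with the RGB image, which is what lets a teacher with the same architecture as the monocular baseline consume it. Because the teacher is only used during training, the student runs at inference time without any LiDAR input.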

Code & Models

Repositories

monster-ghost/monodistill
PyTorch · Official

Videos

MonoDistill: Learning Spatial Features for Monocular 3D Object Detection · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Industrial Vision Systems and Defect Detection