MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection
Hou-I Liu, Christine Wu, Jen-Hao Cheng, Wenhao Chai, Shian-Yun Wang, Gaowen Liu, Hugo Latapie, Jhih-Ciang Wu, Jenq-Neng Hwang, Hong-Han Shuai, and Wen-Huang Cheng

TL;DR
MonoTAKD is a knowledge distillation approach for monocular 3D object detection. It uses a camera-based teaching assistant model to help transfer 3D visual knowledge from a LiDAR-based teacher to a camera-based student, and reports state-of-the-art results on KITTI3D.
Contribution
The paper proposes MonoTAKD, a new distillation framework with a teaching assistant model and residual 3D cues, enhancing monocular 3D detection performance.
Findings
Achieves state-of-the-art results on the KITTI3D dataset.
Demonstrates good generalization on nuScenes and KITTI raw datasets.
Effectively transfers knowledge from LiDAR-based models to camera-based models.
Abstract
Monocular 3D object detection (Mono3D) holds noteworthy promise for autonomous driving applications owing to the cost-effectiveness and rich visual context of monocular camera sensors. However, depth ambiguity poses a significant challenge, as it requires extracting precise 3D scene geometry from a single image, resulting in suboptimal performance when transferring knowledge from a LiDAR-based teacher model to a camera-based student model. To facilitate effective distillation, we introduce Monocular Teaching Assistant Knowledge Distillation (MonoTAKD), which proposes a camera-based teaching assistant (TA) model to transfer robust 3D visual knowledge to the student model, leveraging the smaller feature representation gap. Additionally, we define 3D spatial cues as residual features that capture the differences between the teacher and the TA models. We then leverage these cues to improve…
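The residual idea in the abstract can be illustrated with a minimal sketch. It assumes features are flat lists of floats, that the student has one output mimicking the TA features and a second output predicting the teacher-minus-TA residual, and that both terms use mean squared error. The function names, the two-output student, the MSE loss, and `residual_weight` are all hypothetical choices for illustration; the abstract does not specify the paper's actual losses or architecture.

```python
from typing import List

Feature = List[float]  # flattened feature map; real models would use tensors


def residual_cues(teacher: Feature, ta: Feature) -> Feature:
    """Element-wise difference between teacher (LiDAR) and TA (camera) features."""
    if len(teacher) != len(ta):
        raise ValueError("teacher and TA features must have the same length")
    return [t - a for t, a in zip(teacher, ta)]


def mse(pred: Feature, target: Feature) -> float:
    """Mean squared error between two equal-length feature vectors."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def distillation_loss(
    student_feat: Feature,
    student_residual: Feature,
    teacher: Feature,
    ta: Feature,
    residual_weight: float = 1.0,  # assumed weighting, not taken from the paper
) -> float:
    """Assumed loss: mimic the TA features and predict the teacher-TA residual."""
    target_residual = residual_cues(teacher, ta)
    feature_term = mse(student_feat, ta)
    residual_term = mse(student_residual, target_residual)
    return feature_term + residual_weight * residual_term


# Toy values chosen only to exercise the functions.
teacher = [1.0, 2.0, 3.0]
ta = [0.5, 2.0, 2.0]
cues = residual_cues(teacher, ta)
perfect = distillation_loss(ta, cues, teacher, ta)
```

In this toy setup, a student that reproduces both the TA features and the residual exactly incurs zero loss, and the sum of its two outputs recovers the teacher features.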
Taxonomy
Topics: Advanced Neural Network Applications
Methods: Knowledge Distillation
