Depth-discriminative Metric Learning for Monocular 3D Object Detection
Wonhyeok Choi, Mingyu Shin, Sunghoon Im

TL;DR
This paper introduces a metric learning approach that improves depth discrimination in monocular 3D object detection without increasing inference time or model size, and reports gains on the KITTI and Waymo benchmarks.
Contribution
A new depth-discriminative metric learning scheme with a (K, B, eps)-quasi-isometric loss and an auxiliary depth estimation head, improving depth quality without added inference cost.
Findings
Improves the detection performance of the tested baselines by an average of 23.51% on KITTI
Improves the detection performance of the tested baselines by an average of 5.78% on Waymo
Maintains inference speed while boosting depth discrimination
Abstract
Monocular 3D object detection poses a significant challenge due to the lack of depth information in RGB images. Many existing methods strive to enhance the object depth estimation performance by allocating additional parameters for object depth estimation, utilizing extra modules or data. In contrast, we introduce a novel metric learning scheme that encourages the model to extract depth-discriminative features regardless of the visual attributes without increasing inference time and model size. Our method employs the distance-preserving function to organize the feature space manifold in relation to ground-truth object depth. The proposed (K, B, eps)-quasi-isometric loss leverages predetermined pairwise distance restriction as guidance for adjusting the distance among object descriptors without disrupting the non-linearity of the natural feature manifold. Moreover, we introduce an…
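The abstract describes a loss that constrains pairwise distances between object descriptors using ground-truth depth. In the standard mathematical sense, a map f is quasi-isometric with constants K >= 1 and B >= 0 when, for every pair of points, d(x, y) / K - B <= d(f(x), f(y)) <= K * d(x, y) + B. The sketch below is one possible way to turn those two bounds into a hinge-style penalty, taking depth difference as the input distance and Euclidean distance between descriptors as the output distance. It is an illustration under stated assumptions, not the authors' formulation: the function name `quasi_isometric_penalty`, the default values of `K` and `B`, the averaging over pairs, and the omission of the eps term are all choices made here for clarity.

```python
import math
from itertools import combinations


def quasi_isometric_penalty(features, depths, K=2.0, B=0.5):
    """Average hinge penalty over pairs that violate the quasi-isometric bounds.

    features: list of equal-length float sequences (hypothetical object descriptors)
    depths:   list of ground-truth object depths, one per descriptor
    K, B:     illustrative constants (K >= 1, B >= 0), not values from the paper
    """
    if len(features) != len(depths):
        raise ValueError("features and depths must have the same length")

    total = 0.0
    n_pairs = 0
    for i, j in combinations(range(len(depths)), 2):
        d_depth = abs(depths[i] - depths[j])           # distance in depth space
        d_feat = math.dist(features[i], features[j])   # distance in feature space

        lower = d_depth / K - B   # descriptors should be at least this far apart
        upper = K * d_depth + B   # ...and at most this far apart

        # Only pairs outside the [lower, upper] band contribute to the penalty.
        total += max(0.0, lower - d_feat) + max(0.0, d_feat - upper)
        n_pairs += 1

    return total / n_pairs if n_pairs else 0.0


# Usage: two objects 10 m apart in depth whose descriptors coincide are penalized.
collapsed = quasi_isometric_penalty([[0.0, 0.0], [0.0, 0.0]], [10.0, 20.0])
```

Because pairs inside the band contribute nothing, a penalty of this shape only pushes on descriptors whose spacing is badly out of line with their depth gap, which matches the abstract's stated goal of adjusting distances without flattening the feature manifold.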
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image and Video Retrieval Techniques
