GMM: Delving into Gradient Aware and Model Perceive Depth Mining for Monocular 3D Detection
Weixin Mao, Jinrong Yang, Zheng Ge, Lin Song, Hongyu Zhou, Tiezheng Mao, Zeming Li, Osamu Yoshie

TL;DR
This paper introduces GMM, a novel gradient-aware and model-perceive mining strategy that enhances depth prediction accuracy in monocular 3D detection, leading to state-of-the-art results on the nuScenes dataset.
Contribution
The paper proposes a new mining strategy for depth learning in monocular 3D detection, improving accuracy and general applicability across existing detectors.
Findings
GMM significantly improves 3D detection accuracy.
Achieves state-of-the-art performance on the nuScenes benchmark.
Outperforms existing sample mining techniques.
Abstract
Depth perception is a crucial component of monocular 3D detection tasks that typically involve ill-posed problems. In light of the success of sample mining techniques in 2D object detection, we propose a simple yet effective mining strategy for improving depth perception in 3D object detection. Concretely, we introduce a plain metric to evaluate the quality of depth predictions, which chooses the mined sample for the model. Moreover, we propose a Gradient-aware and Model-perceive Mining strategy (GMM) for depth learning, which exploits the predicted depth quality for better depth learning through easy mining. GMM is a general strategy that can be readily applied to several state-of-the-art monocular 3D detectors, improving the accuracy of depth prediction. Extensive experiments on the nuScenes dataset demonstrate that the proposed methods significantly improve the performance of 3D…
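To make the idea of quality-driven easy mining concrete, here is a minimal sketch in Python. It is not the paper's formulation: the quality metric (an exponential of the relative depth error), the exponent gamma, and the function names depth_quality and easy_mined_depth_loss are all assumptions chosen for illustration. The gradient-aware part of GMM is not modeled here; the sketch only shows how a per-sample quality score could up-weight samples whose depth is already predicted well.

```python
import math


def depth_quality(pred, gt):
    """Hypothetical quality score in (0, 1]; equals 1 when pred == gt.

    Assumption: quality decays exponentially with the relative depth error.
    The paper's actual metric is not given in the abstract. Requires gt > 0.
    """
    return math.exp(-abs(pred - gt) / gt)


def easy_mined_depth_loss(preds, gts, gamma=2.0):
    """Weighted mean L1 depth loss that emphasizes easy (high-quality) samples.

    Each sample's L1 error is weighted by quality ** gamma, so samples with
    accurate depth contribute more than badly predicted ones. gamma is a
    hypothetical sharpness parameter, not taken from the paper.
    """
    weights = [depth_quality(p, g) ** gamma for p, g in zip(preds, gts)]
    losses = [abs(p - g) for p, g in zip(preds, gts)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Two objects at 10 m ground-truth depth: one predicted well, one poorly.
preds = [9.0, 5.0]
gts = [10.0, 10.0]
plain_l1 = sum(abs(p - g) for p, g in zip(preds, gts)) / len(preds)
mined = easy_mined_depth_loss(preds, gts)
```

In this example the weighted loss is lower than the plain mean L1 loss because the poorly predicted sample is down-weighted. In a real training loop the weights would normally be treated as constants, so that the model is not rewarded simply for lowering its own quality scores.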
Taxonomy
Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Domain Adaptation and Few-Shot Learning
