Difficulty-Aware Label-Guided Denoising for Monocular 3D Object Detection
Soyul Lee, Seungmin Baek, Dongbo Min

TL;DR
MonoDLGD introduces a difficulty-aware label-guided denoising framework that adaptively perturbs and reconstructs ground-truth labels based on detection uncertainty, enhancing monocular 3D object detection performance.
Contribution
It is the first to incorporate instance-level difficulty awareness into label denoising for monocular 3D detection, improving robustness and accuracy.
Findings
Achieves state-of-the-art results on the KITTI benchmark.
Effectively handles varying object difficulty levels.
Enhances geometry-aware representation learning.
Abstract
Monocular 3D object detection is a cost-effective solution for applications like autonomous driving and robotics, but remains fundamentally ill-posed due to inherently ambiguous depth cues. Recent DETR-based methods attempt to mitigate this through global attention and auxiliary depth prediction, yet they still struggle with inaccurate depth estimates. Moreover, these methods often overlook instance-level detection difficulty, such as occlusion, distance, and truncation, leading to suboptimal detection performance. We propose MonoDLGD, a novel Difficulty-Aware Label-Guided Denoising framework that adaptively perturbs and reconstructs ground-truth labels based on detection uncertainty. Specifically, MonoDLGD applies stronger perturbations to easier instances and weaker ones to harder cases, and then reconstructs them to provide explicit geometric supervision. By jointly…
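The perturbation rule described in the abstract (stronger noise for easier instances, weaker noise for harder ones) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the excerpt does not say how difficulty is derived from detection uncertainty, what noise distribution is used, or how the box is parameterized. The function names `perturb_labels` and `l1_reconstruction_loss`, the `[x, y, z, h, w, l, yaw]` layout, the difficulty score in [0, 1], and the linear scaling `max_scale * (1 - difficulty)` are all hypothetical and are not taken from the paper.

```python
import random


def perturb_labels(boxes, difficulties, max_scale=0.2, seed=0):
    """Add difficulty-scaled uniform noise to ground-truth 3D box labels.

    boxes: list of boxes, each a list of floats. The layout
        [x, y, z, h, w, l, yaw] is an assumption for this sketch.
    difficulties: one score per box in [0, 1], where 1 is hardest.
        How such a score would be obtained is not specified here.
    max_scale: noise half-width applied to the easiest instance
        (hypothetical hyperparameter).
    """
    if len(boxes) != len(difficulties):
        raise ValueError("need exactly one difficulty score per box")
    rng = random.Random(seed)
    noisy = []
    for box, d in zip(boxes, difficulties):
        if not 0.0 <= d <= 1.0:
            raise ValueError("difficulty must lie in [0, 1]")
        # Easier instance (small d) -> larger noise half-width.
        scale = max_scale * (1.0 - d)
        noisy.append([v + rng.uniform(-scale, scale) for v in box])
    return noisy


def l1_reconstruction_loss(reconstructed, targets):
    """Mean absolute error between reconstructed and clean labels."""
    total, count = 0.0, 0
    for rec, tgt in zip(reconstructed, targets):
        for r, t in zip(rec, tgt):
            total += abs(r - t)
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    gt = [
        [1.0, 1.5, 12.0, 1.5, 1.6, 3.9, 0.10],   # treated as easy
        [-4.0, 1.6, 45.0, 1.4, 1.5, 3.7, -1.2],  # treated as hard
    ]
    noisy = perturb_labels(gt, difficulties=[0.1, 0.9])
    # In a denoising setup, a model would take `noisy` as input and be
    # trained to recover `gt`; here the loss only measures the injected noise.
    print(l1_reconstruction_loss(noisy, gt))
```

In a denoising setup of this kind, the perturbed labels would be fed to a decoder trained to recover the clean labels, with the reconstruction loss serving as the extra geometric supervision; the decoder itself is outside the scope of this sketch.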
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Advanced Vision and Imaging
