Difficulty-Aware Label-Guided Denoising for Monocular 3D Object Detection

Soyul Lee; Seungmin Baek; Dongbo Min

arXiv:2511.13195 · cs.CV · November 18, 2025

TL;DR

MonoDLGD introduces a difficulty-aware label-guided denoising framework that adaptively perturbs and reconstructs ground-truth labels based on detection uncertainty, enhancing monocular 3D object detection performance.

Contribution

MonoDLGD is presented as the first method to incorporate instance-level difficulty awareness into label denoising for monocular 3D detection, improving robustness and accuracy.

Findings

01. Achieves state-of-the-art results on the KITTI benchmark.
02. Effectively handles varying object difficulty levels.
03. Enhances geometry-aware representation learning.

Abstract

Monocular 3D object detection is a cost-effective solution for applications like autonomous driving and robotics, but it remains fundamentally ill-posed because depth cues from a single image are inherently ambiguous. Recent DETR-based methods attempt to mitigate this through global attention and auxiliary depth prediction, yet they still struggle with inaccurate depth estimates. Moreover, these methods often overlook instance-level detection difficulty, such as occlusion, distance, and truncation, leading to suboptimal detection performance. We propose MonoDLGD, a novel Difficulty-Aware Label-Guided Denoising framework that adaptively perturbs and reconstructs ground-truth labels based on detection uncertainty. Specifically, MonoDLGD applies stronger perturbations to easier instances and weaker ones to harder instances, and then reconstructs them to provide explicit geometric supervision. By jointly…
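The perturbation rule described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Box3D` layout, the linear `noise_scale` mapping, the scale bounds 0.05 and 0.4, the uniform relative jitter, and the L1 reconstruction error. The only property carried over from the abstract is the direction of the rule: low uncertainty (easy instance) gives a stronger perturbation, high uncertainty (hard instance) gives a weaker one.

```python
import random
from dataclasses import dataclass, astuple


@dataclass
class Box3D:
    """Ground-truth 3D box: center (x, y, z) and size (w, h, l), in meters."""
    x: float
    y: float
    z: float
    w: float
    h: float
    l: float


def noise_scale(uncertainty, min_scale=0.05, max_scale=0.4):
    """Hypothetical linear mapping from uncertainty in [0, 1] to a noise scale.

    Low uncertainty (easy instance)  -> scale near max_scale.
    High uncertainty (hard instance) -> scale near min_scale.
    """
    u = min(max(uncertainty, 0.0), 1.0)  # clamp to [0, 1]
    return min_scale + (max_scale - min_scale) * (1.0 - u)


def perturb_box(box, uncertainty, rng):
    """Jitter each box parameter by a relative amount bounded by the noise scale."""
    s = noise_scale(uncertainty)
    noisy = [v * (1.0 + rng.uniform(-s, s)) for v in astuple(box)]
    return Box3D(*noisy)


def reconstruction_l1(pred, target):
    """Mean absolute error between a reconstructed box and its ground truth."""
    diffs = [abs(p - t) for p, t in zip(astuple(pred), astuple(target))]
    return sum(diffs) / len(diffs)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
gt = Box3D(x=1.5, y=1.0, z=20.0, w=1.6, h=1.5, l=3.9)
easy_query = perturb_box(gt, uncertainty=0.1, rng=rng)  # stronger perturbation
hard_query = perturb_box(gt, uncertainty=0.9, rng=rng)  # weaker perturbation
```

In a training loop, a denoising decoder would receive queries such as `easy_query` and `hard_query` and be trained to recover `gt`; `reconstruction_l1` stands in for whatever reconstruction loss the actual method uses.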

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Advanced Vision and Imaging