MonoSAOD: Monocular 3D Object Detection with Sparsely Annotated Label

Junyoung Jung; Seokwon Kim; Jung Uk Kim

arXiv:2604.01646·cs.CV·April 7, 2026

TL;DR

MonoSAOD is a framework for monocular 3D object detection on datasets where only a fraction of objects are annotated. It combines road-aware patch augmentation with prototype-based pseudo-label filtering to improve detection accuracy under sparse labels.

Contribution

The paper proposes two modules, Road-Aware Patch Augmentation (RAPA) and Prototype-Based Filtering (PBF), that make better use of sparse annotations for monocular 3D detection and so reduce dependence on costly dense 3D labeling.

Findings

01 Significant performance improvement on sparse datasets

02 Effective pseudo-label filtering with prototype similarity

03 Robust detection with geometry-preserving augmentation
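To make "geometry-preserving augmentation" concrete, here is a minimal sketch of what moving an object patch to a new road location could involve under a pinhole camera model. It is not the paper's RAPA procedure: the flat-road assumption (camera-frame Y unchanged), the `is_road` predicate, the intrinsics, and the function names are all hypothetical, and the change in viewing angle after a lateral move is ignored.

```python
def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point (X, Y, Z) to pixels."""
    X, Y, Z = point
    return (fx * X / Z + cx, fy * Y / Z + cy)


def relocate_patch(center, new_x, new_z, patch_wh, intr, is_road):
    """Move an object along a flat road and rescale its 2D patch.

    Assumptions (not from the paper): the road is flat, so Y is kept;
    apparent size scales as old_depth / new_depth; `is_road` is a
    hypothetical predicate over pixel coordinates.
    Returns None if the new position does not land on the road.
    """
    _, Y, Z = center
    new_center = (new_x, Y, new_z)
    u, v = project(new_center, *intr)
    if not is_road(u, v):
        return None
    scale = Z / new_z  # a farther object appears proportionally smaller
    w, h = patch_wh
    return new_center, (u, v), (w * scale, h * scale)


# Hypothetical intrinsics (fx, fy, cx, cy) and a stand-in road test.
intr = (700.0, 700.0, 600.0, 180.0)
below_horizon = lambda u, v: v > 180.0

result = relocate_patch(
    center=(2.0, 1.5, 20.0),   # original 3D center in metres
    new_x=-3.0, new_z=40.0,    # new lateral offset and depth
    patch_wh=(100.0, 80.0),    # original patch size in pixels
    intr=intr,
    is_road=below_horizon,
)
new_center, pixel, size = result
```

Because the 3D label is updated together with the patch position and size, the pasted object stays consistent with the camera projection, which is the property the finding refers to.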

Abstract

Monocular 3D object detection has achieved impressive performance on densely annotated datasets. However, it struggles when only a fraction of objects are labeled due to the high cost of 3D annotation. This sparsely annotated setting is common in real-world scenarios where annotating every object is impractical. To address this, we propose a novel framework for sparsely annotated monocular 3D object detection with two key modules. First, we propose Road-Aware Patch Augmentation (RAPA), which leverages sparse annotations by augmenting segmented object patches onto road regions while preserving 3D geometric consistency. Second, we propose Prototype-Based Filtering (PBF), which generates high-quality pseudo-labels by filtering predictions through prototype similarity and depth uncertainty. It maintains global 2D RoI feature prototypes and selects pseudo-labels that are both…
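The filtering idea in the abstract, prototype similarity combined with depth uncertainty, can be illustrated with a small sketch. Everything specific here is an assumption rather than the paper's method: the exponential-moving-average prototype update, cosine similarity as the similarity measure, both thresholds, and the names `PrototypeBank` and `filter_pseudo_labels`.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


@dataclass
class PrototypeBank:
    """Per-class running-average RoI features (hypothetical structure)."""
    momentum: float = 0.9  # assumed EMA momentum, not from the paper
    protos: dict = field(default_factory=dict)

    def update(self, cls, feat):
        old = self.protos.get(cls)
        if old is None:
            self.protos[cls] = list(feat)
        else:
            m = self.momentum
            self.protos[cls] = [m * o + (1 - m) * f for o, f in zip(old, feat)]


def filter_pseudo_labels(preds, bank, sim_thr=0.8, unc_thr=0.5):
    """Keep predictions close to their class prototype with low depth
    uncertainty. Both thresholds are illustrative assumptions."""
    kept = []
    for p in preds:
        proto = bank.protos.get(p["cls"])
        if proto is None:
            continue  # no prototype for this class yet
        if cosine(p["feat"], proto) >= sim_thr and p["depth_unc"] <= unc_thr:
            kept.append(p)
    return kept


bank = PrototypeBank()
bank.update("car", [1.0, 0.0])  # prototype built from a labeled object
preds = [
    {"cls": "car", "feat": [0.9, 0.1], "depth_unc": 0.2},  # similar, low uncertainty
    {"cls": "car", "feat": [0.0, 1.0], "depth_unc": 0.2},  # dissimilar feature
    {"cls": "car", "feat": [1.0, 0.0], "depth_unc": 0.9},  # uncertain depth
]
kept = filter_pseudo_labels(preds, bank)
```

Only the first prediction survives in this toy example: the second fails the similarity test and the third fails the uncertainty test.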

Code & Models

Repositories

VisualAIKHU/MonoSAOD (GitHub)
