IDEAL-M3D: Instance Diversity-Enriched Active Learning for Monocular 3D Detection

Johannes Meier; Florian Günther; Riccardo Marin; Oussema Dhaouadi; Jacques Kaiser; Daniel Cremers

arXiv:2511.19301 · cs.CV · November 25, 2025

Open Access

TL;DR

IDEAL-M3D introduces an instance-level active learning approach for monocular 3D detection, leveraging diversity to efficiently select informative samples and reduce annotation costs while maintaining high detection performance.

Contribution

IDEAL-M3D is the first instance-level active learning pipeline for monocular 3D detection, and it uses a diverse ensemble to improve sample selection efficiency.

Findings

1. Achieves comparable or better AP3D with only 60% of the annotations.

2. Demonstrates the effectiveness of diversity-driven sample selection.

3. Reduces annotation costs significantly while maintaining detection accuracy.

Abstract

Monocular 3D detection relies on just a single camera and is therefore easy to deploy. Yet, achieving reliable 3D understanding from monocular images requires substantial annotation, and 3D labels are especially costly. To maximize performance under constrained labeling budgets, it is essential to prioritize annotating the samples expected to deliver the largest performance gains. This prioritization is the focus of active learning. Curiously, we observed two significant limitations in active learning algorithms for monocular 3D object detection. First, previous approaches select entire images, which is inefficient because non-informative instances contained in the same image also need to be labeled. Second, existing methods rely on uncertainty-based selection, which in monocular 3D object detection creates a bias toward depth ambiguity. Consequently, distant objects are selected, while…
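To make the idea of diversity-driven, instance-level selection concrete, the following is a minimal sketch and not the paper's algorithm. It swaps in a generic technique, greedy farthest-point (k-center) sampling, applied to per-instance feature vectors rather than to whole images. The function name `select_diverse_instances`, the `seed_index` parameter, and the toy feature vectors are all hypothetical and chosen only for illustration.

```python
import numpy as np


def select_diverse_instances(features, budget, seed_index=0):
    """Greedy farthest-point (k-center) selection over instance features.

    Hypothetical helper for illustration only; it is not taken from IDEAL-M3D.
    Returns the indices of the selected instances in selection order.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    if n == 0 or budget <= 0:
        return []
    budget = min(budget, n)

    selected = [seed_index]
    # Distance from every instance to its nearest already-selected instance.
    min_dist = np.linalg.norm(features - features[seed_index], axis=1)
    # Mark selected instances so they can never be picked again.
    min_dist[seed_index] = -np.inf

    while len(selected) < budget:
        # Pick the instance that is farthest from everything selected so far.
        candidate = int(np.argmax(min_dist))
        selected.append(candidate)
        dist_to_candidate = np.linalg.norm(features - features[candidate], axis=1)
        min_dist = np.minimum(min_dist, dist_to_candidate)
        min_dist[candidate] = -np.inf
    return selected


# Toy pool (assumed data): two tight clusters and one isolated instance.
toy_features = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]]
)
picked = select_diverse_instances(toy_features, budget=3)
print(picked)
```

On this toy pool the selection spreads across the three regions of feature space instead of spending the budget on near-duplicate instances, which is the intuition behind selecting individual objects for diversity rather than labeling every object in an image.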

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques