MonoCLUE: Object-Aware Clustering Enhances Monocular 3D Object Detection

Sunghun Yang; Minhyeok Lee; Jungho Lee; Sangyoun Lee

arXiv:2511.07862·cs.CV·November 12, 2025



TL;DR

MonoCLUE enhances monocular 3D object detection by combining local clustering of visual features with a generalized scene memory, improving detection accuracy in occluded and truncated scenes, and achieving state-of-the-art results on KITTI.

Contribution

The paper introduces an approach that combines object-aware clustering with a generalized scene memory to make monocular 3D detection more robust and accurate.

Findings

01. Achieves state-of-the-art performance on the KITTI benchmark.

02. Improves detection of partially visible objects.

03. Enhances robustness in occluded and limited-visibility scenarios.

Abstract

Monocular 3D object detection offers a cost-effective solution for autonomous driving but suffers from ill-posed depth and limited field of view. These constraints cause a lack of geometric cues and reduced accuracy in occluded or truncated scenes. While recent approaches incorporate additional depth information to address geometric ambiguity, they overlook the visual cues crucial for robust recognition. We propose MonoCLUE, which enhances monocular 3D detection by leveraging both local clustering and generalized scene memory of visual features. First, we perform K-means clustering on visual features to capture distinct object-level appearance parts (e.g., bonnet, car roof), improving detection of partially visible objects. The clustered features are propagated across regions to capture objects with similar appearances. Second, we construct a generalized scene memory by aggregating…
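The clustering step described in the abstract can be illustrated with a minimal sketch of plain K-means. This is not the authors' implementation: the toy 2D vectors stand in for per-location visual features, the names `kmeans`, `init_centroids`, and `squared_distance` are hypothetical, and the deterministic farthest-point initialization is an assumption chosen only to keep the example reproducible. Feature extraction, cluster propagation, and the scene memory are omitted.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(features, k):
    """Farthest-point initialization (an assumption, not from the paper):
    start from the first feature, then repeatedly add the feature that is
    farthest from its nearest already-chosen centroid."""
    centroids = [list(features[0])]
    while len(centroids) < k:
        farthest = max(
            features,
            key=lambda f: min(squared_distance(f, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(features, k, iters=20):
    """Plain K-means: alternate nearest-centroid assignment and mean update."""
    centroids = init_centroids(features, k)
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: each feature joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(f, centroids[c]))
            for f in features
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy features: two well-separated groups standing in for two
# appearance parts (for example, "bonnet" and "car roof").
features = [
    (0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
    (5.0, 5.1), (5.1, 5.0), (4.9, 5.0),
]
centroids, labels = kmeans(features, k=2)
```

On this toy input the first three vectors receive label 0 and the last three receive label 1. In a detector, the vectors would come from a backbone feature map restricted to an object region, and the resulting centroids would serve as part-level appearance descriptors.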


Taxonomy

Topics: Advanced Neural Network Applications · Face recognition and analysis · Domain Adaptation and Few-Shot Learning