OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning
Xusheng Guo, Wanfa Zhang, Shijia Zhao, Qiming Xia, Xiaolong Xie, Mingming Wang, Hai Wu, Chenglu Wen

TL;DR
This paper introduces OWL, an unsupervised 3D object detection framework that uses occupancy-guided warm-up, large-model priors, and dynamic self-training to improve detection accuracy without annotations.
Contribution
OWL combines occupancy-guided warm-up, large-model prior reasoning, and adaptive self-training to enhance unsupervised 3D object detection performance.
Findings
Outperforms state-of-the-art methods by over 15% mAP on WOD and KITTI datasets.
Effectively filters and refines pseudo-labels using large-model priors.
Improves backbone initialization with occupancy-guided warm-up.
Abstract
Unsupervised 3D object detection leverages heuristic algorithms to discover potential objects, offering a promising route to reduce annotation costs in autonomous driving. Existing approaches mainly generate pseudo labels and refine them through self-training iterations. However, these pseudo-labels are often incorrect at the beginning of training, resulting in misleading the optimization process. Moreover, effectively filtering and refining them remains a critical challenge. In this paper, we propose OWL for unsupervised 3D object detection by occupancy guided warm-up and large-model priors reasoning. OWL first employs an Occupancy Guided Warm-up (OGW) strategy to initialize the backbone weight with spatial perception capabilities, mitigating the interference of incorrect pseudo-labels on network convergence. Furthermore, OWL introduces an Instance-Cued Reasoning (ICR) module that…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Multimodal Machine Learning Applications · Autonomous Vehicle Technology and Safety
