SC3D: Label-Efficient Outdoor 3D Object Detection via Single Click Annotation
Qiming Xia, Hongwei Lin, Wei Ye, Hai Wu, Yadan Luo, Cheng Wang, Chenglu Wen

TL;DR
SC3D introduces a highly label-efficient 3D object detection method that uses only a single coarse bird's-eye-view click per frame, significantly reducing annotation effort while achieving state-of-the-art results.
Contribution
The paper proposes a novel progressive pipeline with pseudo-labeling and mixed supervision to enable effective 3D detection from minimal click annotations.
Findings
Achieves state-of-the-art performance with only 0.2% annotation cost.
Outperforms existing weakly-supervised methods on nuScenes and KITTI datasets.
Demonstrates the effectiveness of single-click annotations for outdoor 3D detection.
Abstract
LiDAR-based outdoor 3D object detection has received widespread attention. However, training 3D detectors from the LiDAR point cloud typically relies on expensive bounding box annotations. This paper presents SC3D, an innovative label-efficient method requiring only a single coarse click on the bird's eye view of the 3D point cloud for each frame. A key challenge here is the absence of complete geometric descriptions of the target objects from such simple click annotations. To address this issue, our proposed SC3D adopts a progressive pipeline. Initially, we design a mixed pseudo-label generation module that expands limited click annotations into a mixture of bounding box and semantic mask supervision. Next, we propose a mix-supervised teacher model, enabling the detector to learn mixed supervision information. Finally, we introduce a mixed-supervised student network that leverages the…
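To make the first stage of the pipeline more concrete, here is a minimal sketch of how a single bird's-eye-view click could be expanded into a coarse point mask and a box. It is not the authors' pseudo-label generation module: the function name `expand_click`, the fixed `radius`, and the axis-aligned box are all assumptions chosen for illustration only.

```python
from math import hypot


def expand_click(points, click_xy, radius=2.0):
    """Expand one BEV click into a coarse mask and an axis-aligned box.

    Hypothetical helper, not the SC3D module.
    points:   list of (x, y, z) tuples in the LiDAR frame.
    click_xy: (x, y) click position in the same frame.
    radius:   assumed BEV search radius in metres.
    Returns (mask, box): mask has one bool per point; box is
    (xmin, ymin, zmin, xmax, ymax, zmax), or None if no point is selected.
    """
    cx, cy = click_xy
    # A point belongs to the coarse mask if its BEV distance to the click is small.
    mask = [hypot(x - cx, y - cy) <= radius for x, y, _ in points]
    selected = [p for p, keep in zip(points, mask) if keep]
    if not selected:
        return mask, None
    xs, ys, zs = zip(*selected)
    # Tightest axis-aligned box around the selected points (no heading estimate).
    box = (min(xs), min(ys), min(zs), max(xs), max(ys), max(zs))
    return mask, box


points = [(0.0, 0.0, 0.0), (1.0, 0.5, 1.2), (10.0, 10.0, 0.3)]
mask, box = expand_click(points, click_xy=(0.5, 0.0))
```

In this toy input the two points near the click are selected and the distant one is not. A real pipeline would need clustering, ground removal, and heading estimation, none of which this sketch attempts.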
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
