Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency
Runzhou Tao, Wencheng Han, Zhongying Qiu, Cheng-zhong Xu, Jianbing Shen

TL;DR
This paper introduces a weakly supervised monocular 3D object detection approach that trains with only 2D labels marked on images. It exploits projection, multi-view, and direction consistency to reach performance comparable to some fully supervised methods.
Contribution
It proposes a novel weakly supervised architecture utilizing projection, multi-view, and direction consistency, reducing reliance on 3D labels for training.
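As a rough illustration of the projection-consistency idea, the sketch below projects a predicted 3D box into the image and compares its enclosing 2D rectangle with a 2D label. This is not the authors' implementation: the function names, the (length, height, width) box parameterization, the pinhole intrinsics matrix `K`, and the L1 penalty are all assumptions made for the example.

```python
import numpy as np

def box3d_corners(center, dims, yaw):
    """Return the 8 corners, shape (8, 3), of a 3D box in camera coordinates.

    Assumed convention: dims = (length, height, width), yaw rotates about
    the camera y-axis, and center is the geometric centre of the box.
    """
    l, h, w = dims
    x = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * l / 2
    y = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * h / 2
    z = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * w / 2
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    return (R @ np.vstack([x, y, z])).T + np.asarray(center, dtype=float)

def project_to_2d_box(corners, K):
    """Project corners with intrinsics K and return (u_min, v_min, u_max, v_max).

    Assumes every corner lies in front of the camera (positive depth).
    """
    uvw = (K @ corners.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    return np.concatenate([uv.min(axis=0), uv.max(axis=0)])

def projection_consistency_loss(center, dims, yaw, K, box2d):
    """Mean absolute difference between the projected box and a 2D label."""
    pred = project_to_2d_box(box3d_corners(center, dims, yaw), K)
    return float(np.abs(pred - np.asarray(box2d, dtype=float)).mean())
```

A projection term like this only constrains the box as seen from one viewpoint, which is one reason the method also uses multi-view and direction consistency.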
Findings
Achieves comparable performance to some fully supervised methods.
When used for pre-training, the method lets a model outperform the corresponding fully supervised baseline while using fewer 3D labels.
Uses a new 2D direction labeling technique for rotation prediction.
Abstract
Monocular 3D object detection has become a mainstream approach in automatic driving for its easy application. A prominent advantage is that it does not need LiDAR point clouds during inference. However, most current methods still rely on 3D point cloud data for labeling the ground truths used in the training phase. This inconsistency between training and inference makes it hard to utilize the large-scale feedback data and increases the data collection expenses. To bridge this gap, we propose a new weakly supervised monocular 3D object detection method, which can train the model with only 2D labels marked on images. To be specific, we explore three types of consistency in this task, i.e. the projection, multi-view and direction consistency, and design a weakly-supervised architecture based on these consistencies. Moreover, we propose a new 2D direction labeling method in this…
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · 3D Surveying and Cultural Heritage
