Recursive Cross-View: Use Only 2D Detectors to Achieve 3D Object Detection without 3D Annotations
Shun Gui, Yan Luximon

TL;DR
This paper introduces Recursive Cross-View (RCV), a novel 3D object detection method that relies solely on 2D detectors and no 3D annotations, converting 3D detection into multiple 2D tasks and achieving real-time performance.
Contribution
The proposed RCV method is the first to generate fully oriented 3D bounding boxes without requiring 3D labels, using a recursive paradigm based on 2D detection and cross-view principles.
Findings
Outperforms existing image-based 3D detection methods on SUN RGB-D and KITTI datasets.
Successfully applied to 3D human and hand detection, creating new annotated datasets.
Operates at 7 fps on live RGB-D streams, enabling real-time applications.
Abstract
Heavy reliance on 3D annotations limits the real-world application of 3D object detection. In this paper, we propose a method that requires no 3D annotations yet predicts fully oriented 3D bounding boxes. Our method, called Recursive Cross-View (RCV), uses the three-view principle to convert 3D detection into multiple 2D detection tasks, requiring only a subset of 2D labels. We propose a recursive paradigm in which instance segmentation and 3D bounding box generation by Cross-View are applied recursively until convergence. Specifically, the method builds a frustum for each 2D bounding box and then applies the recursive paradigm, which ultimately generates a fully oriented 3D box along with its class and score. Note that the class and score are given by the 2D detector. Evaluated on the SUN RGB-D and KITTI…
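The frustum step and the three-view idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a depth image stored as a list of rows in metres, and a 2D box given as `(u_min, v_min, u_max, v_max)` in pixels. The function names `frustum_points` and `three_views`, the depth limits, and the choice of axis-aligned orthographic projections are all hypothetical choices made for illustration; the paper's actual projection and recursion details are not given in this summary.

```python
from typing import Dict, List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def frustum_points(
    depth: Sequence[Sequence[float]],
    box: Tuple[int, int, int, int],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    min_depth: float = 0.1,   # assumed validity range, in metres
    max_depth: float = 10.0,
) -> List[Point3D]:
    """Back-project the depth pixels inside a 2D box into camera-frame points.

    The returned points are those lying in the viewing frustum of the box
    (pinhole model; the box is half-open: u_min <= u < u_max, same for v).
    """
    u_min, v_min, u_max, v_max = box
    points: List[Point3D] = []
    for v in range(max(v_min, 0), min(v_max, len(depth))):
        row = depth[v]
        for u in range(max(u_min, 0), min(u_max, len(row))):
            z = row[u]
            if not (min_depth <= z <= max_depth):
                continue  # skip missing or out-of-range depth readings
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def three_views(points: Sequence[Point3D]) -> Dict[str, List[Point2D]]:
    """Orthographically project 3D points onto three axis-aligned planes.

    Each view is a 2D point set on which a 2D detector or a 2D box fit
    could operate (an assumption about how the views would be consumed).
    """
    return {
        "front": [(x, y) for x, y, _ in points],  # drop depth
        "top": [(x, z) for x, _, z in points],    # drop height
        "side": [(z, y) for _, y, z in points],   # drop width
    }
```

For example, calling `frustum_points(depth, (1, 1, 3, 3), 1.0, 1.0, 1.0, 1.0)` on a small depth map keeps only the valid pixels inside the box, and `three_views` then yields three 2D point sets from them. In a recursive scheme of the kind the abstract describes, one would presumably re-estimate the box and re-select points until the result stops changing, but that loop is left out here because its details are not stated in this summary.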
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Hand Gesture Recognition Systems
