Vision-based Lifting of 2D Object Detections for Automated Driving
Hendrik Königshof, Kun Li, Christoph Stiller

TL;DR
This paper introduces a cost-effective, camera-only pipeline that lifts 2D object detections to 3D for all types of road users, reaching accuracy comparable to state-of-the-art image-based detectors on the KITTI benchmark at only a third of the runtime.
Contribution
It processes the point cloud belonging to each 2D detection with a 2D CNN to produce 3D detections, which the authors believe is a first. The approach covers all types of road users rather than only cars and reduces runtime compared to existing image-based methods.
Findings
Achieves accuracy comparable to state-of-the-art image-based 3D detectors on the KITTI 3D object detection benchmark.
Runs in only a third of the runtime of those image-based approaches.
Covers all types of road users, not just cars.
Abstract
Image-based 3D object detection is an inevitable part of autonomous driving because cheap onboard cameras are already available in most modern cars. Because LiDAR provides accurate depth information, most current state-of-the-art 3D object detectors rely heavily on LiDAR data. In this paper, we propose a pipeline that lifts the results of existing vision-based 2D algorithms to 3D detections using only cameras, as a cost-effective alternative to LiDAR. In contrast to existing approaches, we focus not only on cars but on all types of road users. To the best of our knowledge, we are the first to use a 2D CNN to process the point cloud for each 2D detection, keeping the computational effort as low as possible. Our evaluation on the challenging KITTI 3D object detection benchmark shows results comparable to state-of-the-art image-based approaches at only a third of the runtime.
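The abstract does not say how the per-detection point cloud is obtained or how it is fed to the 2D CNN. As a rough illustration only, the sketch below assumes a dense depth map (for example from stereo matching) and a pinhole camera model, and back-projects the pixels inside one 2D bounding box into 3D camera coordinates. The function name `lift_box_to_points`, the box format, and the choice to keep the points in their image-grid layout (so that a 2D CNN could read them like a three-channel image) are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def lift_box_to_points(
    depth: Sequence[Sequence[float]],
    box: Tuple[int, int, int, int],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[List[Optional[Point]]]:
    """Back-project the depth pixels inside one 2D box to 3D points.

    Assumptions (not from the paper): `depth` holds metric depth per pixel,
    `box` is (u_min, v_min, u_max, v_max) with exclusive upper bounds, and
    (fx, fy, cx, cy) are pinhole intrinsics. Pixels without valid depth
    (<= 0) are returned as None.
    """
    u_min, v_min, u_max, v_max = box
    grid: List[List[Optional[Point]]] = []
    for v in range(v_min, v_max):
        row: List[Optional[Point]] = []
        for u in range(u_min, u_max):
            z = depth[v][u]
            if z <= 0.0:
                row.append(None)  # no depth estimate for this pixel
                continue
            # Pinhole back-projection into the camera frame.
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            row.append((x, y, z))
        grid.append(row)
    return grid
```

Keeping the points organized by pixel position is one plausible way to make a point cloud consumable by a 2D CNN; the paper itself should be consulted for the representation the authors actually use.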
