RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving
Peixuan Li, Shun Su, Huaici Zhao

TL;DR
RTS3D introduces a real-time stereo 3D detection method using a novel 4D feature-consistent embedding space, improving accuracy and efficiency over previous pseudo-LiDAR approaches without requiring depth supervision.
Contribution
The paper proposes a new 4D feature-consistent embedding space for stereo 3D detection that eliminates the need for depth supervision and enhances efficiency and accuracy.
Findings
Runs at more than 24 FPS on the KITTI benchmark.
Improves average precision by 10% over the previous state-of-the-art method.
Described by the authors as the first true real-time system (FPS > 24) for stereo-image 3D detection.
Abstract
Although recent image-based 3D object detection methods using the Pseudo-LiDAR representation have shown great capabilities, a notable gap in efficiency and accuracy still exists compared with LiDAR-based methods. Moreover, over-reliance on a stand-alone depth estimator, which requires a large number of pixel-wise annotations during training and additional computation during inference, limits scalable application in the real world. In this paper, we propose an efficient and accurate 3D object detection method from stereo images, named RTS3D. Unlike the 3D occupancy space used in Pseudo-LiDAR-style methods, we design a novel 4D feature-consistent embedding (FCE) space as the intermediate representation of the 3D scene, without depth supervision. The FCE space encodes the object's structural and semantic information by exploring the multi-scale feature consistency warped…
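The idea of measuring feature consistency between the two stereo views can be illustrated with a small sketch. This is not the authors' implementation: the abstract above is truncated, so the function names (`project`, `sample_nearest`, `feature_consistency`), the nearest-neighbour sampling, the cosine-similarity score, and the rectified pinhole camera model with a horizontal baseline are all assumptions chosen for illustration. The sketch projects candidate 3D points into the left and right feature maps and scores how well the sampled features agree; a point at the true depth of a surface should score close to 1.

```python
import numpy as np


def project(points, fx, fy, cx, cy, baseline=0.0):
    """Pinhole projection of (N, 3) points given in the left-camera frame.

    `baseline` shifts the camera centre along +X, so baseline=0 gives the
    left view and baseline=b gives the right view of a rectified stereo pair.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    u = fx * (x - baseline) / z + cx
    v = fy * y / z + cy
    return np.stack([u, v], axis=1)


def sample_nearest(feat, uv):
    """Nearest-neighbour lookup in a (C, H, W) feature map.

    Returns an (N, C) array; points that fall outside the image get zeros.
    """
    c, h, w = feat.shape
    u = np.rint(uv[:, 0]).astype(int)
    v = np.rint(uv[:, 1]).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    out = np.zeros((uv.shape[0], c), dtype=feat.dtype)
    out[valid] = feat[:, v[valid], u[valid]].T
    return out


def feature_consistency(left_feat, right_feat, points, fx, fy, cx, cy, baseline):
    """Cosine similarity between left and right features at each 3D point."""
    fl = sample_nearest(left_feat, project(points, fx, fy, cx, cy))
    fr = sample_nearest(right_feat, project(points, fx, fy, cx, cy, baseline))
    num = np.sum(fl * fr, axis=1)
    den = np.linalg.norm(fl, axis=1) * np.linalg.norm(fr, axis=1)
    return num / np.maximum(den, 1e-8)  # guard against all-zero features
```

A point whose two projections land on matching features receives a high score, while a point at the wrong depth samples unrelated features and scores low. A learned model would typically use bilinear sampling and features from several network scales rather than this single-scale nearest-neighbour lookup.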
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Vision and Imaging · Industrial Vision Systems and Defect Detection
