Efficient Stereo Depth Estimation for Pseudo LiDAR: A Self-Supervised Approach Based on Multi-Input ResNet Encoder
Sabir Hossain, Xianke Lin

TL;DR
This paper introduces a self-supervised method that uses a multi-input ResNet encoder for efficient stereo depth estimation, generating pseudo LiDAR point clouds in real time and improving accuracy and speed for autonomous vehicle perception.
Contribution
It presents a novel self-supervised stereo depth estimation approach that leverages multi-input ResNet encoders to produce accurate and fast pseudo LiDAR point clouds from stereo images.
Findings
Outperforms existing methods on the KITTI benchmark
Generates point clouds significantly faster
Achieves more accurate depth estimation
Abstract
Perception and localization are essential for autonomous delivery vehicles and are mostly estimated from 3D LiDAR sensors because of their precise distance measurement capability. This paper presents a strategy for obtaining a real-time pseudo point cloud from an image sensor instead of a laser sensor. We propose using different depth estimators to produce LiDAR-like pseudo point clouds and obtain better performance. Moreover, the depth estimator is trained and validated on stereo imagery, which yields more accurate depth estimates and point clouds. Our approach to generating depth maps outperforms existing methods on the KITTI benchmark while yielding point clouds significantly faster than other approaches.
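The step from a depth map to a LiDAR-like point cloud can be illustrated with standard pinhole-camera geometry. The sketch below is not the authors' implementation and does not include their network. It only shows the usual conversion from stereo disparity to depth (depth = focal length × baseline / disparity) and the back-projection of each pixel into 3D. The function names, the intrinsics (fx, fy, cx, cy), the baseline, and the 80 m range cutoff are all assumptions chosen for illustration.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Convert a disparity map (pixels) to depth (meters): depth = f * B / d.

    Pixels with disparity <= eps are marked invalid with depth 0.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.zeros_like(disparity)
    valid = disparity > eps
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth


def depth_to_point_cloud(depth, fx, fy, cx, cy, max_depth=80.0):
    """Back-project an HxW depth map into an Nx3 array of camera-frame points.

    max_depth is a hypothetical range cutoff, not a value from the paper.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # u holds column indices, v holds row indices, both shaped (h, w)
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = (depth > 0) & (depth < max_depth)
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


if __name__ == "__main__":
    # Toy 2x3 disparity map; one pixel has zero disparity and is dropped.
    disp = np.array([[10.0, 10.0, 10.0],
                     [10.0, 0.0, 10.0]])
    depth = disparity_to_depth(disp, focal_px=700.0, baseline_m=0.5)
    cloud = depth_to_point_cloud(depth, fx=700.0, fy=700.0, cx=1.0, cy=0.0)
    print(cloud.shape)
```

The resulting points are in the camera frame (x right, y down, z forward). Feeding them to a detector trained on LiDAR data would additionally require the camera-to-LiDAR extrinsic transform, which is omitted here.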
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Optical measurement and interference techniques
