TL;DR
SBEVNet is an end-to-end deep learning framework that estimates bird's eye view layouts from stereo images, combining disparity features and inverse perspective mapping to improve spatial understanding for robotics applications.
Contribution
The paper introduces SBEVNet, a novel approach that bypasses explicit depth estimation by learning effective bird's eye view features for layout estimation.
Findings
Achieves state-of-the-art results on KITTI and CARLA datasets.
Effectively combines disparity features with IPM for detailed scene understanding.
Improves layout estimation accuracy over baseline methods.
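For readers unfamiliar with IPM, the geometry behind it can be sketched in a few lines. This is the generic textbook formulation, not the paper's implementation. It assumes a pinhole camera with no pitch or roll, mounted at a known height above a perfectly flat ground plane. The function names and the parameters `fx`, `fy`, `cx`, `cy`, and `cam_height` are hypothetical and chosen for illustration only.

```python
def pixel_to_ground(u, v, fx, fy, cx, cy, cam_height):
    """Back-project pixel (u, v) onto a flat ground plane.

    Assumed camera frame: x right, y down, z forward, with the ground at
    y = cam_height. Returns (x, z) in the units of cam_height, or None when
    the pixel lies at or above the horizon row (v <= cy).
    """
    dv = v - cy
    if dv <= 0:
        return None  # the viewing ray never meets the ground plane
    z = fy * cam_height / dv      # forward distance along the ground
    x = (u - cx) * z / fx         # lateral offset at that distance
    return x, z


def ground_to_pixel(x, z, fx, fy, cx, cy, cam_height):
    """Forward pinhole projection of the ground point (x, cam_height, z)."""
    u = fx * x / z + cx
    v = fy * cam_height / z + cy
    return u, v


# Usage with made-up intrinsics: map a pixel to the ground and back again.
point = pixel_to_ground(700.0, 300.0, 720.0, 720.0, 620.0, 190.0, 1.65)
pixel = ground_to_pixel(point[0], point[1], 720.0, 720.0, 620.0, 190.0, 1.65)
```

The flat-ground assumption is what makes the mapping invertible from a single image. It breaks down for anything that rises above the road surface, which is one reason a method might pair IPM with stereo disparity cues.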
Abstract
Accurate layout estimation is crucial for planning and navigation in robotics applications, such as self-driving. In this paper, we introduce the Stereo Bird's Eye View Network (SBEVNet), a novel supervised end-to-end framework for estimation of bird's eye view layout from a pair of stereo images. Although our network reuses some of the building blocks from the state-of-the-art deep learning networks for disparity estimation, we show that explicit depth estimation is neither sufficient nor necessary. Instead, the learning of a good internal bird's eye view feature representation is effective for layout estimation. Specifically, we first generate a disparity feature volume using the features of the stereo images and then project it to the bird's eye view coordinates. This gives us coarse-grained information about the scene structure. We also apply inverse perspective mapping (IPM) to map…
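The projection from disparity space to bird's eye view coordinates mentioned in the abstract rests on standard rectified-stereo geometry. The sketch below shows only that geometry for a single (column, disparity) pair. The network described above operates on learned feature volumes, and none of the names here (`disparity_to_bev`, `bev_cell`, `fx`, `cx`, `baseline`, the grid ranges) come from the paper; they are assumptions for illustration.

```python
def disparity_to_bev(u, disparity, fx, cx, baseline):
    """Map an image column u and a disparity (both in pixels) to BEV (x, z).

    Uses the rectified-stereo relation z = fx * baseline / disparity; x and z
    come out in the units of baseline.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = fx * baseline / disparity   # depth along the optical axis
    x = (u - cx) * z / fx           # lateral offset at that depth
    return x, z


def bev_cell(x, z, x_range, z_range, cell_size):
    """Return (row, col) of the BEV grid cell containing (x, z).

    Rows index forward distance and columns index lateral offset. Returns
    None when the point falls outside the grid.
    """
    x_min, x_max = x_range
    z_min, z_max = z_range
    if not (x_min <= x < x_max and z_min <= z < z_max):
        return None
    row = int((z - z_min) / cell_size)
    col = int((x - x_min) / cell_size)
    return row, col


# Usage with made-up calibration values and a 20 m x 20 m grid of 0.5 m cells.
x, z = disparity_to_bev(u=700.0, disparity=36.0, fx=720.0, cx=620.0, baseline=0.54)
cell = bev_cell(x, z, x_range=(-10.0, 10.0), z_range=(0.0, 20.0), cell_size=0.5)
```

Because depth is inversely proportional to disparity, evenly spaced disparity levels cover nearby space densely and distant space sparsely. Any resampling onto a uniform BEV grid has to account for that.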
Videos
SBEVNet: End-to-End Deep Stereo Layout Estimation · YouTube
Taxonomy
Methods: Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator
