TL;DR
SalsaNet is a fast encoder-decoder network that segments the road and vehicles in LiDAR point clouds for autonomous driving. It operates on a Bird-Eye-View (BEV) projection of the point cloud, is trained on automatically generated labels, and is shown to be projection-agnostic.
Contribution
The paper introduces SalsaNet, a novel encoder-decoder network for LiDAR segmentation, and presents an auto-labeling method that transfers automatically generated labels from the camera to LiDAR, producing training data without manual annotation.
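The geometric core of a camera-to-LiDAR label transfer can be sketched as follows. This is a minimal illustration, not the paper's pipeline: it assumes a per-pixel label image is already available from some camera segmentation model and that a calibrated 3x4 projection matrix `P` maps LiDAR coordinates to image pixels. The function name `transfer_labels`, the `unlabeled` value, and the argument layout are hypothetical.

```python
import numpy as np

def transfer_labels(points, label_img, P, unlabeled=-1):
    """Assign each LiDAR point the label of the image pixel it projects onto.

    points    : (N, 3+) array, x/y/z in the LiDAR frame (extra columns ignored)
    label_img : (H, W) integer array of per-pixel class labels (assumed given)
    P         : (3, 4) LiDAR-to-image projection matrix (assumed calibrated)
    Returns an (N,) integer array; points that fall outside the image or
    behind the camera receive `unlabeled`.
    """
    n = points.shape[0]
    homo = np.hstack([points[:, :3], np.ones((n, 1))])  # homogeneous coords
    uvw = homo @ P.T                                     # (N, 3)
    depth = uvw[:, 2]

    labels = np.full(n, unlabeled, dtype=np.int64)
    front = depth > 1e-6                                 # in front of the camera

    # Perspective divide only for points with positive depth.
    u = np.zeros(n)
    v = np.zeros(n)
    u[front] = uvw[front, 0] / depth[front]
    v[front] = uvw[front, 1] / depth[front]

    col = np.floor(u).astype(int)
    row = np.floor(v).astype(int)
    h, w = label_img.shape
    inside = front & (col >= 0) & (col < w) & (row >= 0) & (row < h)

    labels[inside] = label_img[row[inside], col[inside]]
    return labels
```

Points outside the camera's field of view stay unlabeled in this sketch, so only the part of the sweep that overlaps the image would receive labels.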
Findings
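Outperforms other state-of-the-art semantic segmentation networks on KITTI in accuracy and speed
Shows that BEV projection is effective and that SalsaNet is projection-agnostic (BEV compared with spherical-front-view)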
Auto-labeling reduces manual annotation effort
Abstract
In this paper, we introduce a deep encoder-decoder network, named SalsaNet, for efficient semantic segmentation of 3D LiDAR point clouds. SalsaNet segments the road, i.e., drivable free-space, and vehicles in the scene by employing the Bird-Eye-View (BEV) image projection of the point cloud. To overcome the lack of annotated point cloud data, in particular for the road segments, we introduce an auto-labeling process which transfers automatically generated labels from the camera to LiDAR. We also explore the role of image-like projection of LiDAR data in semantic segmentation by comparing BEV with spherical-front-view projection and show that SalsaNet is projection-agnostic. We perform quantitative and qualitative evaluations on the KITTI dataset, which demonstrate that the proposed SalsaNet outperforms other state-of-the-art semantic segmentation networks in terms of accuracy and…
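To make the BEV image projection concrete, here is a minimal sketch of rasterising a point cloud into a top-down grid. The grid extents, the cell size, and the two output channels (point count and maximum height) are illustrative assumptions, not values or channels confirmed by the abstract; `bev_project` is a hypothetical name.

```python
import numpy as np

def bev_project(points, x_range=(0.0, 50.0), y_range=(-12.0, 12.0), cell=0.2):
    """Rasterise (N, 3+) LiDAR points (x forward, y left, z up) into a BEV grid.

    Returns two (H, W) arrays: the number of points per cell and the maximum
    z value per cell (0.0 where the cell is empty). Ranges and cell size are
    assumed values chosen for illustration.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    # Keep only points inside the region covered by the grid.
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]))
    x, y, z = x[keep], y[keep], z[keep]

    h = int(round((x_range[1] - x_range[0]) / cell))
    w = int(round((y_range[1] - y_range[0]) / cell))

    rows = np.floor((x - x_range[0]) / cell).astype(int)
    cols = np.floor((y - y_range[0]) / cell).astype(int)
    # Guard against floating-point rounding at the upper boundary.
    rows = np.clip(rows, 0, h - 1)
    cols = np.clip(cols, 0, w - 1)

    count = np.zeros((h, w), dtype=np.int32)
    max_height = np.full((h, w), -np.inf)

    # Unbuffered accumulation so repeated (row, col) indices are all applied.
    np.add.at(count, (rows, cols), 1)
    np.maximum.at(max_height, (rows, cols), z)

    max_height[count == 0] = 0.0
    return count, max_height
```

The resulting arrays can be stacked into a multi-channel image and fed to any 2D segmentation network. A spherical-front-view projection would replace the (x, y) binning with binning over azimuth and elevation angles while leaving the network input format unchanged.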
