TL;DR
FPS-Net introduces a novel convolutional fusion network that effectively exploits the unique characteristics of different point cloud modalities for improved large-scale LiDAR point cloud segmentation in autonomous driving.
Contribution
The paper proposes a modality-specific feature learning and fusion approach within an encoder-decoder framework, enhancing segmentation accuracy over existing projection-based methods.
Findings
FPS-Net outperforms state-of-the-art methods on benchmark datasets.
Modality grouping and fusion improve segmentation accuracy.
The approach is compatible with existing projection-based segmentation methods.
Abstract
Scene understanding based on LiDAR point clouds is an essential task for autonomous cars to drive safely, and it often employs spherical projection to map 3D point clouds into multi-channel 2D images for semantic segmentation. Most existing methods simply stack different point attributes/modalities (e.g., coordinates, intensity, depth) as image channels to increase information capacity, but ignore the distinct characteristics of point attributes in different image channels. We design FPS-Net, a convolutional fusion network that exploits the uniqueness and discrepancy among the projected image channels for optimal point cloud segmentation. FPS-Net adopts an encoder-decoder structure. Instead of simply stacking multiple channel images as a single input, we group them into different modalities to first learn modality-specific features separately and then map the learned features into a…
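The spherical projection and modality grouping mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `spherical_projection`, the 64 x 2048 image size, and the field-of-view defaults are illustrative assumptions. The three-way grouping into coordinates, depth, and intensity is also an assumption, based only on the attributes the abstract lists as examples.

```python
import numpy as np


def spherical_projection(points, intensity, height=64, width=2048,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project N LiDAR points onto a (height, width) range image.

    points: (N, 3) array of x, y, z. intensity: (N,) array.
    Returns a dict of per-modality images; pixels hit by no point stay zero.
    All default values are illustrative assumptions, not taken from the paper.
    """
    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Depth (range) of each point; drop points at the sensor origin.
    depth = np.linalg.norm(points, axis=1)
    keep = depth > 0
    points, intensity, depth = points[keep], intensity[keep], depth[keep]

    # Horizontal (yaw) and vertical (pitch) angles of each point.
    yaw = np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(points[:, 2] / depth)

    # Map angles to pixel coordinates and clamp them to the image bounds.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height
    cols = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    rows = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # When several points fall into one pixel, keep only the nearest one:
    # sort by depth, then take the first occurrence of each pixel index.
    order = np.argsort(depth)
    flat = rows[order] * width + cols[order]
    _, first = np.unique(flat, return_index=True)
    sel = order[first]
    r, c = rows[sel], cols[sel]

    # One image per modality (assumed grouping) instead of one 5-channel stack.
    xyz_img = np.zeros((3, height, width), dtype=np.float32)
    depth_img = np.zeros((1, height, width), dtype=np.float32)
    intensity_img = np.zeros((1, height, width), dtype=np.float32)
    xyz_img[:, r, c] = points[sel].T
    depth_img[0, r, c] = depth[sel]
    intensity_img[0, r, c] = intensity[sel]

    return {"coordinates": xyz_img, "depth": depth_img,
            "intensity": intensity_img}
```

Concatenating the three returned arrays along axis 0 gives the single five-channel input that the abstract says most existing methods use. Keeping them separate lets each modality pass through its own feature extractor before fusion.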
Taxonomy
Methods: Convolution · Batch Normalization · Concatenated Skip Connection · Dense Block
