Enhancing 3D LiDAR Segmentation by Shaping Dense and Accurate 2D Semantic Predictions
Xiaoyu Dong, Tiankui Xian, Wanshui Gan, Naoto Yokoya

TL;DR
This paper introduces MM2D3D, a multi-modal segmentation model that enhances 3D LiDAR point cloud segmentation by shaping dense, accurate 2D semantic predictions using camera images, leading to improved 3D accuracy.
Contribution
The paper proposes a multi-modal approach that combines cross-modal guided filtering with dynamic cross pseudo supervision to improve both 2D and 3D semantic segmentation accuracy.
Findings
Achieves denser, more accurate 2D semantic predictions
Significantly improves 3D LiDAR segmentation performance
Outperforms previous methods in both 2D and 3D metrics
Abstract
Semantic segmentation of 3D LiDAR point clouds is important in urban remote sensing for understanding real-world street environments. This task, by projecting LiDAR point clouds and 3D semantic labels as sparse maps, can be reformulated as a 2D problem. However, the intrinsic sparsity of the projected LiDAR and label maps can result in sparse and inaccurate intermediate 2D semantic predictions, which in turn limits the final 3D accuracy. To address this issue, we enhance this task by shaping dense and accurate 2D predictions. Specifically, we develop a multi-modal segmentation model, MM2D3D. By leveraging camera images as auxiliary data, we introduce cross-modal guided filtering to overcome label map sparsity by constraining intermediate 2D semantic predictions with dense semantic relations derived from the camera images; and we introduce dynamic cross pseudo supervision to overcome…
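The reformulation of the 3D task as a 2D one, and the sparsity it produces, can be illustrated with a minimal sketch. The abstract does not say which projection the authors use, so the pinhole camera model below is an assumption, and every name in it (`project_labels`, `fill_ratio`, `IGNORE`, the intrinsics `fx`, `fy`, `cx`, `cy`) is hypothetical rather than taken from the paper. The sketch projects labeled 3D points onto an image grid, keeps the nearest point per pixel, and reports what fraction of pixels receive a label at all.

```python
import math

IGNORE = -1  # hypothetical "no label" value for pixels that no point lands on


def project_labels(points, labels, fx, fy, cx, cy, height, width):
    """Project labeled 3D points onto a sparse 2D label map.

    Assumes a pinhole camera (an assumption, not the paper's stated setup).
    points: iterable of (x, y, z) in camera coordinates, z pointing forward.
    labels: iterable of integer class ids, one per point.
    Returns (label_map, depth_map) as nested lists of size height x width.
    """
    label_map = [[IGNORE] * width for _ in range(height)]
    depth_map = [[math.inf] * width for _ in range(height)]
    for (x, y, z), label in zip(points, labels):
        if z <= 0:  # point is behind the camera, so it cannot be projected
            continue
        u = math.floor(fx * x / z + cx)  # pixel column
        v = math.floor(fy * y / z + cy)  # pixel row
        inside = 0 <= u < width and 0 <= v < height
        # When several points hit one pixel, keep the closest one.
        if inside and z < depth_map[v][u]:
            depth_map[v][u] = z
            label_map[v][u] = label
    return label_map, depth_map


def fill_ratio(label_map):
    """Fraction of pixels that carry a label; low values mean a sparse map."""
    total = sum(len(row) for row in label_map)
    filled = sum(1 for row in label_map for value in row if value != IGNORE)
    return filled / total


# Toy example: four points, one behind the camera, two sharing a pixel.
points = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (0.0, 0.0, 5.0), (0.0, 0.0, -1.0)]
labels = [1, 2, 3, 4]
label_map, depth_map = project_labels(points, labels, 10.0, 10.0, 2.0, 2.0, 4, 4)
print(fill_ratio(label_map))
```

Even in this toy case most pixels stay unlabeled, which is the kind of label-map sparsity the abstract identifies as the limiting factor for the intermediate 2D predictions.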
Taxonomy
Topics: Remote Sensing and LiDAR Applications · Advanced Neural Network Applications · 3D Shape Modeling and Analysis
