Semi-Dense 3D Semantic Mapping from Monocular SLAM
Xuanpeng Li, Rachid Belaroussi

TL;DR
This paper presents a method for semi-dense 3D semantic mapping using monocular SLAM combined with deep learning, enabling efficient semantic labeling in 3D environments without dense per-frame segmentation.
Contribution
It introduces a novel approach that transfers 2D semantic information to 3D maps via keyframe correspondence, reducing computational load and improving semantic labeling accuracy.
Findings
Improved 2D semantic labeling over baseline methods.
Effective semantic mapping in both indoor and outdoor scenes.
No need for dense per-frame segmentation, saving computation time.
Abstract
Combining geometry and appearance in computer vision has proven to be a promising solution for robots across a wide variety of applications. Stereo cameras and RGB-D sensors are widely used to realise fast, dense 3D reconstruction and trajectory tracking. However, they lack the flexibility to switch seamlessly between environments of different scale, i.e., indoor and outdoor scenes. In addition, semantic information is still hard to acquire in 3D mapping. We address this challenge by combining a state-of-the-art deep learning method with semi-dense Simultaneous Localisation and Mapping (SLAM) based on the video stream from a monocular camera. In our approach, 2D semantic information is transferred to the 3D map via correspondence between connected keyframes with spatial consistency. There is no need to obtain a semantic segmentation for each frame in a sequence, so that it could…
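The transfer of 2D labels into a 3D map can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera, a semi-dense depth map in which 0 marks pixels without a depth estimate, and a per-pixel label image produced by some 2D segmenter for that keyframe. The function name `backproject_labeled_keyframe` and all parameter names are hypothetical.

```python
import numpy as np

def backproject_labeled_keyframe(depth, labels, K, T_wc):
    """Lift the valid pixels of one keyframe into labeled 3D points.

    Illustrative assumptions (not taken from the paper):
      depth  : (H, W) float array, 0 where no depth estimate exists
      labels : (H, W) int array of per-pixel class ids
      K      : (3, 3) pinhole intrinsics
      T_wc   : (4, 4) camera-to-world pose of the keyframe
    """
    # Semi-dense: only pixels that actually carry a depth estimate are used.
    v, u = np.nonzero(depth > 0)
    z = depth[v, u]

    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    # Pinhole back-projection into the camera frame.
    x = (u - cx) / fx * z
    y = (v - cy) / fy * z
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)

    # Move the points into the world frame using the keyframe pose.
    pts_world = (T_wc @ pts_cam.T).T[:, :3]

    # Each 3D point inherits the class id of the pixel it came from.
    return pts_world, labels[v, u]
```

Because only keyframes are passed through such a step, frames between keyframes never need a segmentation of their own, which is the saving the abstract describes. Fusing labels across overlapping keyframes is left out of this sketch.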
Taxonomy
Topics: Robotics and Sensor-Based Localization · 3D Surveying and Cultural Heritage · Advanced Image and Video Retrieval Techniques
