Structure-SLAM: Low-Drift Monocular SLAM in Indoor Environments
Yanyan Li, Nikolas Brasch, Yida Wang, Nassir Navab, Federico Tombari

TL;DR
This paper introduces a low-drift monocular SLAM system for indoor environments. It decouples rotation and translation estimation and uses surface normals predicted by a neural network to improve accuracy and robustness.
Contribution
It combines CNN-based surface normal prediction with decoupled pose estimation: rotation is estimated first from lines and surface normals, and translation is then computed from point and line features.
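To make the benefit of decoupling concrete: once the rotation R is fixed, the translation t enters the projection constraints linearly and can be solved by least squares. The following is a minimal sketch under simplifying assumptions, not the paper's implementation. It uses only 3D-2D point correspondences (no line features), a calibrated camera with normalized image coordinates, and the convention x_cam = R X + t. The names `skew` and `translation_given_rotation` are hypothetical.

```python
import numpy as np


def skew(v):
    """Return the 3x3 cross-product matrix [v]_x so that skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])


def translation_given_rotation(R, points_3d, bearings):
    """Solve for t with R known, assuming x_cam = R @ X + t.

    Each bearing b (normalized image coordinates (u, v, 1)) must be parallel
    to R @ X + t, so b x (R @ X + t) = 0, which is linear in t:
        skew(b) @ t = -skew(b) @ (R @ X)
    All constraints are stacked and solved in the least-squares sense.
    """
    A = np.vstack([skew(b) for b in bearings])
    rhs = np.concatenate([-skew(b) @ (R @ X) for b, X in zip(bearings, points_3d)])
    t, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return t
```

At least two correspondences with distinct bearings are needed for the stacked system to have full rank. A real system would typically minimize a robust reprojection error instead of this algebraic residual.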
Findings
Outperforms state-of-the-art methods on ICL-NUIM and TUM RGB-D benchmarks.
Reduces long-term drift in indoor monocular SLAM.
Effectively leverages geometric scene information for pose estimation.
Abstract
In this paper a low-drift monocular SLAM method is proposed targeting indoor scenarios, where monocular SLAM often fails due to the lack of textured surfaces. Our approach decouples rotation and translation estimation of the tracking process to reduce the long-term drift in indoor environments. In order to take full advantage of the available geometric information in the scene, surface normals are predicted by a convolutional neural network from each input RGB image in real-time. First, a drift-free rotation is estimated based on lines and surface normals using spherical mean-shift clustering, leveraging the weak Manhattan World assumption. Then translation is computed from point and line features. Finally, the estimated poses are refined with a map-to-frame optimization strategy. The proposed method outperforms the state of the art on common SLAM benchmarks such as ICL-NUIM and TUM…
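The rotation step described in the abstract can be illustrated with a small sketch. This is a simplified stand-in rather than the authors' algorithm: it runs a flat-kernel mean shift directly on the unit sphere over surface normals only (no line features), seeded with an initial guess for the three Manhattan axes, and then projects the three modes onto the nearest rotation matrix with an SVD. The function names, the 20-degree bandwidth, and the convention that the columns of R are the Manhattan axes expressed in the camera frame are all assumptions made for illustration.

```python
import numpy as np


def spherical_mean_shift(normals, seed, bandwidth_deg=20.0, iters=50, tol=1e-10):
    """Move `seed` to the nearby density mode of the unit vectors in `normals`."""
    normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    mode = seed / np.linalg.norm(seed)
    cos_band = np.cos(np.deg2rad(bandwidth_deg))
    for _ in range(iters):
        # Treat n and -n as the same axis: flip vectors onto the mode's hemisphere.
        signs = np.where(normals @ mode < 0.0, -1.0, 1.0)
        aligned = normals * signs[:, None]
        # Flat kernel: keep only vectors within the angular bandwidth of the mode.
        inside = aligned @ mode > cos_band
        if not inside.any():
            break
        new_mode = aligned[inside].mean(axis=0)
        new_mode /= np.linalg.norm(new_mode)
        converged = 1.0 - new_mode @ mode < tol
        mode = new_mode
        if converged:
            break
    return mode


def project_to_rotation(M):
    """Return the rotation matrix closest to M in the Frobenius norm."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # enforce det = +1
    return U @ D @ Vt


def estimate_manhattan_rotation(normals, R_init):
    """Refine each axis (column of R_init) by mean shift, then re-orthonormalize."""
    modes = np.column_stack(
        [spherical_mean_shift(normals, R_init[:, k]) for k in range(3)]
    )
    return project_to_rotation(modes)
```

Because each frame's rotation is estimated against the scene's dominant directions rather than against the previous frame, errors do not accumulate over time, which is the sense in which the abstract calls the rotation drift-free. An axis with no supporting normals inside the bandwidth simply keeps its seed value in this sketch.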
Taxonomy
Topics: Robotics and Sensor-Based Localization · 3D Surveying and Cultural Heritage · Advanced Vision and Imaging
