Monocular Object and Plane SLAM in Structured Environments
Shichao Yang, Sebastian Scherer

TL;DR
This paper introduces a monocular SLAM method that leverages high-level object and plane landmarks to produce denser, more semantic maps and improve localization accuracy in structured environments.
Contribution
It proposes a novel high-order graphical model for joint inference of 3D objects and planes from single images within a SLAM framework, enhancing map density and semantic richness.
Findings
Improves camera localization accuracy, especially when there is no loop closure.
Generates dense, robust maps in structured environments.
Outperforms state-of-the-art SLAM in localization accuracy on public and collected datasets, including ICL NUIM and TUM Mono.
Abstract
In this paper, we present a monocular Simultaneous Localization and Mapping (SLAM) algorithm using high-level object and plane landmarks. The built map is denser, more compact, and more semantically meaningful than that of feature-point-based SLAM. We first propose a high-order graphical model to jointly infer the 3D objects and layout planes from single images, considering occlusions and semantic constraints. The extracted objects and planes are further optimized with camera poses in a unified SLAM framework. Compared to points, objects and planes can provide more semantic constraints, such as Manhattan plane and object supporting relationships. Experiments on various public and collected datasets including ICL NUIM and TUM Mono show that our algorithm can improve camera localization accuracy compared to state-of-the-art SLAM, especially when there is no loop closure, and also generate dense maps…
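To make the two semantic constraints named in the abstract more concrete, here is a minimal sketch of how a Manhattan plane constraint and an object supporting relationship could be written as residuals for an optimizer to drive toward zero. This is an illustration only, not the paper's formulation: the function names, the plane parameterization (normal, d) with normal . p + d = 0, and the cuboid described by a center and a half-height are all assumptions made for this example.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def manhattan_residual(n1, n2):
    """Hypothetical Manhattan residual between two plane normals.

    It is 0 when the planes are parallel or perpendicular and grows
    as the angle between them moves away from 0 or 90 degrees.
    """
    c = abs(dot(normalize(n1), normalize(n2)))
    return min(c, 1.0 - c)


def support_residual(plane, center, half_height):
    """Hypothetical supporting residual for a cuboid resting on a plane.

    plane is (normal, d), with normal . p + d = 0 for points p on the plane.
    The result is the signed gap between the cuboid's bottom face and the
    plane, assuming the cuboid's vertical axis is aligned with the normal.
    """
    normal, d = plane
    scale = math.sqrt(dot(normal, normal))
    signed_dist = (dot(normal, center) + d) / scale  # center-to-plane distance
    return signed_dist - half_height


floor = ((0.0, 0.0, 1.0), 0.0)   # the plane z = 0
wall = (0.0, 1.0, 0.0)           # normal of a wall perpendicular to the floor
print(manhattan_residual(floor[0], wall))
print(support_residual(floor, (1.0, 2.0, 0.5), 0.5))
```

In a full system, residuals of this kind would be added as extra factors alongside the camera-landmark measurement terms; the weighting and exact error form are design choices that this sketch does not try to reproduce.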
