Real-Time Monocular Object-Model Aware Sparse SLAM
Mehdi Hosseinzadeh, Kejie Li, Yasir Latif, Ian Reid

TL;DR
This paper presents a real-time monocular SLAM system that integrates semantic object detection and planar structures to produce semantically enriched maps and improve localization accuracy.
Contribution
It introduces a novel method combining deep learning-based object and plane detection with SLAM, enabling real-time semantic mapping with refined object shapes and structural scene understanding.
Findings
Semantic objects and planes improve map richness.
Enhanced localization accuracy demonstrated.
Real-time performance achieved.
Abstract
Simultaneous Localization And Mapping (SLAM) is a fundamental problem in mobile robotics. While sparse point-based SLAM methods provide accurate camera localization, the generated maps lack semantic information. On the other hand, state-of-the-art object detection methods provide rich information about entities present in the scene from a single image. This work incorporates a real-time deep-learned object detector into the monocular SLAM framework, representing generic objects as quadrics that permit detections to be seamlessly integrated while preserving real-time performance. A finer reconstruction of an object, learned by a CNN, is also incorporated and provides a shape prior for the quadric, leading to further refinement. To capture the dominant structure of the scene, additional planar landmarks are detected by a CNN-based plane detector and modeled as independent landmarks…
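To make the quadric representation concrete, here is a minimal sketch of how an ellipsoid landmark can be compared against a 2D detection box. It uses the standard projective-geometry identity that a dual quadric Q* projects to a dual conic C* = P Q* P^T under a 3x4 camera matrix P. The excerpt above does not give the paper's exact formulation, so the function names, the axis-aligned ellipsoid, and the identity camera are all assumptions made for illustration.

```python
import math

def matmul(a, b):
    # Plain nested-list matrix product, to keep the sketch dependency-free.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dual_quadric(center, semi_axes):
    # Hypothetical helper: axis-aligned ellipsoid as a 4x4 dual quadric,
    # Q* = Z diag(a^2, b^2, c^2, -1) Z^T, with Z a pure translation.
    tx, ty, tz = center
    a, b, c = semi_axes
    Z = [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]
    D = [[a * a, 0, 0, 0], [0, b * b, 0, 0], [0, 0, c * c, 0], [0, 0, 0, -1]]
    return matmul(matmul(Z, D), transpose(Z))

def project_to_dual_conic(P, Q):
    # Dual quadric to dual conic: C* = P Q* P^T (3x3, symmetric).
    return matmul(matmul(P, Q), transpose(P))

def conic_bounding_box(C):
    # A vertical line x = u is l = (1, 0, -u); it is tangent to the conic
    # when l^T C* l = 0, a quadratic in u. Likewise for horizontal lines.
    # Assumes the ellipsoid lies fully in front of the camera.
    dx = math.sqrt(C[0][2] ** 2 - C[0][0] * C[2][2])
    dy = math.sqrt(C[1][2] ** 2 - C[1][1] * C[2][2])
    xs = sorted([(C[0][2] + dx) / C[2][2], (C[0][2] - dx) / C[2][2]])
    ys = sorted([(C[1][2] + dy) / C[2][2], (C[1][2] - dy) / C[2][2]])
    return xs[0], ys[0], xs[1], ys[1]

# Illustrative setup: unit sphere 5 units ahead of an identity camera.
P = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
Q = dual_quadric((0.0, 0.0, 5.0), (1.0, 1.0, 1.0))
box = conic_bounding_box(project_to_dual_conic(P, Q))
print(box)
```

In a SLAM back end, the difference between such a predicted box and the detector's box could serve as a residual on the quadric parameters. That is one plausible reading of how detections are "seamlessly integrated", not a description of the authors' implementation.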
