Self-Supervised Geometry-Guided Initialization for Robust Monocular Visual Odometry
Takayuki Kanai, Igor Vasiljevic, Vitor Guizilini, Kazuhiro Shintani

TL;DR
This paper introduces a self-supervised, geometry-guided initialization method for monocular visual odometry that enhances robustness and accuracy, especially in challenging outdoor scenarios with large motions and dynamic objects.
Contribution
It proposes leveraging a frozen large-scale pre-trained monocular depth estimator to improve dense SLAM initialization without additional fine-tuning.
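As a rough illustration of what using a frozen depth estimator for initialization could look like, the sketch below turns a predicted depth map into a normalized inverse-depth map that a dense bundle-adjustment back end could take as its starting point. It is not the authors' implementation. The function name `init_inverse_depth`, the `min_depth` clamp, and the median normalization are all assumptions made for this example; the normalization is there only because monocular depth predictions are generally defined up to an unknown scale.

```python
from statistics import median


def init_inverse_depth(pred_depth, min_depth=1e-3):
    """Convert a predicted depth map into a median-normalized inverse-depth map.

    pred_depth: list of rows of positive floats, standing in for the output of
    a frozen monocular depth estimator (hypothetical input for this sketch).
    min_depth: assumed clamp that avoids division by zero.
    """
    # Inverse depth per pixel, clamping very small depths.
    inv = [[1.0 / max(d, min_depth) for d in row] for row in pred_depth]
    # Normalize so the median inverse depth is 1 (assumed convention that
    # removes the arbitrary scale of the prediction).
    scale = median(v for row in inv for v in row)
    return [[v / scale for v in row] for row in inv]


# Hypothetical 2x2 depth prediction in arbitrary units.
pred_depth = [[2.0, 4.0], [4.0, 8.0]]
init = init_inverse_depth(pred_depth)
```

The estimator stays frozen in this picture: its prediction is only read once per frame to seed the optimization, and no gradients flow back into it.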
Findings
Significant improvements on the KITTI odometry benchmark.
Enhanced robustness in large motion and dynamic object scenarios.
Effective initialization method for dense SLAM models.
Abstract
Monocular visual odometry is a key technology in various autonomous systems. Traditional feature-based methods suffer from failures due to poor lighting, insufficient texture, and large motions. In contrast, recent learning-based dense SLAM methods exploit iterative dense bundle adjustment to address such failure cases, and achieve robust and accurate localization in a wide variety of real environments, without depending on domain-specific supervision. However, despite their potential, these methods still struggle with scenarios involving large motion and object dynamics. In this study, we diagnose key weaknesses in a popular learning-based dense SLAM model (DROID-SLAM) by analyzing major failure cases on outdoor benchmarks and exposing various shortcomings of its optimization process. We then propose the use of self-supervised priors leveraging a frozen large-scale pre-trained monocular…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Image and Object Detection Techniques
