MDN-VO: Estimating Visual Odometry with Confidence
Nimet Kaygusuz, Oscar Mendez, Richard Bowden

TL;DR
This paper introduces MDN-VO, a deep learning model that estimates visual odometry and its confidence, outperforming traditional methods and effectively detecting failure cases through uncertainty estimation.
Contribution
The paper presents a novel CNN-RNN hybrid model with a Mixture Density Network for joint pose estimation and uncertainty modeling in visual odometry.
Findings
Outperforms state-of-the-art VO methods on KITTI and nuScenes datasets.
Effectively detects failure cases using predicted pose uncertainty.
Provides pose estimates trained with pose labels as supervision, while deriving their confidence levels in an unsupervised manner.
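The failure-detection finding above can be illustrated with a minimal sketch. It assumes the confidence signal is the total variance of the predicted Gaussian mixture and that a pose is flagged when any dimension exceeds a fixed threshold. The function names, the threshold, and the variance-based criterion are all hypothetical choices made for illustration; this summary does not state the measure the authors actually use.

```python
import numpy as np

def mixture_mean_and_variance(pi, mu, sigma):
    """Moments of a diagonal Gaussian mixture.

    pi:    (K,)   mixture weights, summing to 1
    mu:    (K, D) component means
    sigma: (K, D) component standard deviations
    """
    mean = np.sum(pi[:, None] * mu, axis=0)
    # Law of total variance: within-component spread plus spread of the means.
    var = np.sum(pi[:, None] * (sigma ** 2 + (mu - mean) ** 2), axis=0)
    return mean, var

def is_likely_failure(var, threshold):
    """Hypothetical rule: flag the estimate if any dimension is too uncertain."""
    return bool(np.any(var > threshold))

# Two equally weighted 1-D components that disagree about the motion.
pi = np.array([0.5, 0.5])
mu = np.array([[0.0], [2.0]])
sigma = np.array([[1.0], [1.0]])
mean, var = mixture_mean_and_variance(pi, mu, sigma)
```

In practice a threshold like this would have to be tuned on held-out sequences; it is not a value taken from the paper.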
Abstract
Visual Odometry (VO) is used in many applications including robotics and autonomous systems. However, traditional approaches based on feature matching are computationally expensive and do not directly address failure cases, instead relying on heuristic methods to detect failure. In this work, we propose a deep learning-based VO model to efficiently estimate 6-DoF poses, as well as a confidence model for these estimates. We utilise a CNN-RNN hybrid model to learn feature representations from image sequences. We then employ a Mixture Density Network (MDN) which estimates camera motion as a mixture of Gaussians, based on the extracted spatio-temporal representations. Our model uses pose labels as a source of supervision, but derives uncertainties in an unsupervised manner. We evaluate the proposed model on the KITTI and nuScenes datasets and report extensive quantitative and qualitative…
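To make the Mixture Density Network idea concrete, here is a minimal NumPy sketch of the standard MDN training objective: the negative log-likelihood of a ground-truth pose under a mixture of diagonal Gaussians. It shows how pose labels alone can shape both the means and the variances, which is one common reading of "unsupervised" uncertainty. The function name, the argument layout, and the diagonal-covariance assumption are illustrative and not taken from the paper, whose exact loss is not given in this summary.

```python
import numpy as np

def mdn_nll(pi_logits, mu, log_sigma, target):
    """Negative log-likelihood of `target` under a diagonal Gaussian mixture.

    pi_logits: (K,)   unnormalised mixture weights
    mu:        (K, D) component means
    log_sigma: (K, D) log standard deviations
    target:    (D,)   ground-truth pose vector (D = 6 for a 6-DoF pose)
    """
    d = target.shape[0]
    # Log-softmax over the mixture weights, computed stably.
    log_pi = pi_logits - np.logaddexp.reduce(pi_logits)
    # Per-component Gaussian log-density with diagonal covariance.
    z = (target - mu) / np.exp(log_sigma)
    log_comp = (-0.5 * np.sum(z ** 2, axis=1)
                - np.sum(log_sigma, axis=1)
                - 0.5 * d * np.log(2.0 * np.pi))
    # Log-sum-exp over components gives the mixture log-likelihood.
    return -np.logaddexp.reduce(log_pi + log_comp)

# One unit-variance component centred on the target pose.
target = np.zeros(6)
nll_single = mdn_nll(np.array([0.0]), np.zeros((1, 6)), np.zeros((1, 6)), target)
# The same density split into two identical components.
nll_double = mdn_nll(np.array([0.0, 0.0]), np.zeros((2, 6)), np.zeros((2, 6)), target)
# A target far from the component mean is less likely.
nll_far = mdn_nll(np.array([0.0]), np.zeros((1, 6)), np.zeros((1, 6)), np.ones(6))
```

Minimising this loss needs only pose labels: the variances receive no direct supervision and are learned because they change the likelihood of the labelled pose.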
