Deep Online Correction for Monocular Visual Odometry
Jiaxin Zhang, Wei Sui, Xinggang Wang, Wenming Meng, Hongmei Zhu, Qian Zhang

TL;DR
This paper introduces a deep online correction (DOC) framework for monocular visual odometry. At inference time it refines CNN-predicted poses by gradient-based minimization of photometric error, leaving the CNN weights unchanged.
Contribution
The proposed DOC framework combines CNN-based initial pose estimation with online pose refinement. It avoids complex back-end optimization modules and, unlike online-learning methods, needs no gradient propagation to the CNN parameters, which reduces computation during inference.
Findings
Achieves 2.0% RTE on KITTI Seq. 09, outperforming traditional methods.
Fully relies on deep learning without complex back-end modules.
Comparable to hybrid methods in accuracy.
Abstract
In this work, we propose a novel deep online correction (DOC) framework for monocular visual odometry. The pipeline has two stages. First, depth maps and initial poses are obtained from convolutional neural networks (CNNs) trained in a self-supervised manner. Second, the poses predicted by the CNNs are further improved during inference by minimizing photometric errors via gradient updates of the poses. The benefits of our proposed method are twofold: 1) Unlike online-learning methods, DOC does not need to propagate gradients to the parameters of the CNNs, so it saves computation during inference. 2) Unlike hybrid methods that combine CNNs with traditional methods, DOC relies fully on deep learning (DL) frameworks. Even without complex back-end optimization modules, our method achieves outstanding performance with relative transform error (RTE)…
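The second stage can be illustrated with a deliberately simplified 1D toy, which is not the authors' implementation. It assumes a scalar pixel shift stands in for the camera pose, a hand-picked initial guess stands in for the frozen CNN prediction, and `scene`, `sample`, and `refine_shift` are hypothetical helper names. Only the pose variable receives gradient updates; nothing resembling network weights is touched.

```python
import math

def scene(u):
    """Hypothetical smooth 1D intensity profile standing in for an image."""
    return math.sin(0.2 * u) + 0.5 * math.sin(0.07 * u + 1.0)

def sample(img, x):
    """Linearly interpolate img at continuous position x; also return d/dx."""
    x = min(max(x, 0.0), len(img) - 1.000001)  # keep both neighbours in range
    i = int(x)
    a = x - i
    value = (1.0 - a) * img[i] + a * img[i + 1]
    slope = img[i + 1] - img[i]
    return value, slope

def photometric_error(target, source, shift, pixels):
    """Mean squared intensity difference after warping source by shift."""
    total = 0.0
    for p in pixels:
        warped, _ = sample(source, p + shift)
        total += (warped - target[p]) ** 2
    return total / len(pixels)

def refine_shift(target, source, init_shift, pixels, lr=10.0, steps=100):
    """Gradient descent on the shift only (the 'pose' in this toy)."""
    shift = init_shift
    for _ in range(steps):
        grad = 0.0
        for p in pixels:
            warped, slope = sample(source, p + shift)
            # Chain rule: d/dshift of (warped - target)^2
            grad += 2.0 * (warped - target[p]) * slope
        shift -= lr * grad / len(pixels)
    return shift

# Assumed toy data: the source view is the target view displaced by 3 pixels.
TRUE_SHIFT = 3.0
target = [scene(u) for u in range(100)]
source = [scene(u - TRUE_SHIFT) for u in range(100)]
pixels = range(10, 90)   # interior pixels, so warped samples stay in bounds
init_shift = 2.0         # stand-in for the frozen network's initial pose

refined = refine_shift(target, source, init_shift, pixels)
print("initial error:", photometric_error(target, source, init_shift, pixels))
print("refined shift:", refined)
print("refined error:", photometric_error(target, source, refined, pixels))
```

In the real setting the warp would depend on a 6-DoF pose and a predicted depth map rather than a single shift, but the structure is the same: the photometric error is differentiated with respect to the pose alone, so each correction step is cheap compared with backpropagating through a network.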
