Recurrent Neural Network for (Un-)supervised Learning of Monocular Video Visual Odometry and Depth
Rui Wang, Stephen M. Pizer, Jan-Michael Frahm

TL;DR
This paper introduces a recurrent neural network approach for monocular video-based depth and odometry estimation that leverages motion information, achieving superior results in both supervised and unsupervised settings on the KITTI dataset.
Contribution
It presents a novel RNN-based model that incorporates motion cues for depth and odometry estimation, capable of training in supervised or unsupervised modes, and generalizes from multi-view to single-view depth estimation.
Findings
Outperforms state-of-the-art learning-based depth estimation methods on the KITTI driving dataset
Effective in both supervised and unsupervised training modes
Generalizes well from multi-view to single-view depth estimation
Abstract
Deep learning-based, single-view depth estimation methods have recently shown highly promising results. However, such methods ignore one of the most important features for determining depth in the human vision system, which is motion. We propose a learning-based, multi-view dense depth map and odometry estimation method that uses Recurrent Neural Networks (RNN) and trains utilizing multi-view image reprojection and forward-backward flow-consistency losses. Our model can be trained in a supervised or even unsupervised mode. It is designed for depth and visual odometry estimation from video where the input frames are temporally correlated. However, it also generalizes to single-view depth estimation. Our method produces superior results to the state-of-the-art approaches for single-view and multi-view learning-based depth estimation on the KITTI driving dataset.
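The abstract names multi-view image reprojection and forward-backward flow consistency as training signals but does not spell out the formulas. The following is a minimal sketch of the underlying geometry only, assuming a standard pinhole camera and a static scene; it is not the authors' implementation. The function names, the intrinsics, and the 1 m forward motion are all hypothetical values chosen for illustration. A pixel with known depth is mapped into a second view, then mapped back with the inverse pose; for a consistent depth and pose, the forward and backward flows should cancel.

```python
# Pinhole-camera reprojection sketch in pure Python (no external libraries).
# Every name and number here is an illustrative assumption, not the paper's code.

def backproject(u, v, depth, intrinsics):
    """Lift pixel (u, v) with known depth to a 3D point in the camera frame."""
    fx, fy, cx, cy = intrinsics
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)

def transform(point, rotation, translation):
    """Apply a rigid motion: rotation (3x3 nested lists), then translation."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

def project(point, intrinsics):
    """Project a 3D point to pixel coordinates; also return its depth."""
    fx, fy, cx, cy = intrinsics
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy, z)

def reproject(u, v, depth, rotation, translation, intrinsics):
    """Map a pixel with known depth into the other view: (u', v', depth')."""
    point = backproject(u, v, depth, intrinsics)
    return project(transform(point, rotation, translation), intrinsics)

def invert_pose(rotation, translation):
    """Inverse rigid motion: (R^T, -R^T t)."""
    r_inv = [[rotation[j][i] for j in range(3)] for i in range(3)]
    t_inv = tuple(
        -sum(r_inv[i][j] * translation[j] for j in range(3)) for i in range(3)
    )
    return r_inv, t_inv

# Hypothetical intrinsics (fx, fy, cx, cy) and camera motion.
INTRINSICS = (720.0, 720.0, 620.0, 187.0)
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
FORWARD_1M = (0.0, 0.0, -1.0)  # camera moves 1 m forward, so points get 1 m closer

# Forward pass: pixel in frame t with depth 10 m, mapped into frame t+1.
u0, v0, d0 = 700.0, 200.0, 10.0
u1, v1, d1 = reproject(u0, v0, d0, IDENTITY, FORWARD_1M, INTRINSICS)
forward_flow = (u1 - u0, v1 - v0)

# Backward pass: the reprojected pixel, with its new depth, mapped back to frame t.
r_inv, t_inv = invert_pose(IDENTITY, FORWARD_1M)
u2, v2, _ = reproject(u1, v1, d1, r_inv, t_inv, INTRINSICS)
backward_flow = (u2 - u1, v2 - v1)

# Forward-backward consistency: the two flows should sum to (almost) zero.
consistency_error = (
    abs(forward_flow[0] + backward_flow[0]) + abs(forward_flow[1] + backward_flow[1])
)
print(forward_flow, backward_flow, consistency_error)
```

In a learned setting the depth and pose would be network predictions, so a nonzero consistency error (or a photometric difference between the source pixel and the pixel it reprojects to) could serve as a training penalty without ground-truth depth. How the paper weights or masks such terms is not stated in the abstract.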
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Image Processing Techniques and Applications
