DeepVO: A Deep Learning approach for Monocular Visual Odometry

Vikram Mohanty; Shubh Agrawal; Shaswat Datta; Arna Ghosh; Vishnu Dutt Sharma; Debashish Chakravarty

arXiv:1611.06069 · cs.CV · November 21, 2016 · 45 cites

Open Access

TL;DR

This paper introduces a deep learning framework for monocular visual odometry, replacing traditional feature-based methods, and demonstrates promising real-time pose estimation results in known environments.

Contribution

The paper proposes a CNN architecture tailored for monocular visual odometry and highlights its effectiveness in estimating motion and scale without additional sensors.
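To make the odometry task concrete, here is a minimal sketch of how per-frame-pair motion estimates are chained into a trajectory. It is not the authors' architecture: this page does not say how the network's output is parameterized, so the planar `(dx, dy, dtheta)` format, the helper names `se2` and `integrate`, and the hard-coded motions standing in for network predictions are all assumptions made for illustration.

```python
import numpy as np


def se2(dx, dy, dtheta):
    """Build a 3x3 homogeneous transform for one planar motion step.

    Hypothetical helper: (dx, dy) is the translation expressed in the
    previous camera frame, dtheta is the heading change in radians.
    """
    c, s = np.cos(dtheta), np.sin(dtheta)
    return np.array([[c, -s, dx],
                     [s, c, dy],
                     [0.0, 0.0, 1.0]])


def integrate(relative_motions):
    """Chain relative motions into global poses, starting at the origin."""
    pose = np.eye(3)
    trajectory = [pose.copy()]
    for dx, dy, dtheta in relative_motions:
        # Right-multiply: each step is expressed in the current camera frame.
        pose = pose @ se2(dx, dy, dtheta)
        trajectory.append(pose.copy())
    return np.stack(trajectory)


# Stand-in for network outputs (assumed values, not results from the paper):
# four steps, each moving 1 unit forward and then turning 90 degrees left.
motions = [(1.0, 0.0, np.pi / 2)] * 4
trajectory = integrate(motions)
positions = trajectory[:, :2, 2]  # (x, y) of the camera after each step
```

Because every global pose is a product of all earlier estimates, an error in any single step carries into every later pose. A monocular system must also supply the metric scale of each `(dx, dy)` itself, which is why the scale estimation mentioned above matters.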

Findings

01

Deep learning can effectively estimate camera motion from monocular images.

02

The proposed CNN performs well in known environments for real-time pose estimation.

03

Pre-trained features influence the network's ability to estimate motion accurately.

Abstract

Deep Learning based techniques have been adopted with precision to solve a lot of standard computer vision problems, some of which are image classification, object detection and segmentation. Despite the widespread success of these approaches, they have not yet been exploited largely for solving the standard perception related problems encountered in autonomous navigation such as Visual Odometry (VO), Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM). This paper analyzes the problem of Monocular Visual Odometry using a Deep Learning-based framework, instead of the regular 'feature detection and tracking' pipeline approaches. Several experiments were performed to understand the influence of a known/unknown environment, a conventional trackable feature and pre-trained activations tuned for object classification on the network's ability to accurately estimate the…

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques