# DistanceNet: Estimating Traveled Distance from Monocular Images using a Recurrent Convolutional Neural Network

**Authors:** Robin Kreuzig, Matthias Ochs, Rudolf Mester

arXiv: 1904.08105 · 2019-04-18

## TL;DR

DistanceNet is a deep recurrent convolutional neural network (RCNN) that estimates the distance traveled across a set of consecutive monocular frames. It targets the scale ambiguity of monocular vSLAM/VO by learning geometric features in a CNN and temporal dynamics in an RNN.

## Contribution

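The paper introduces an end-to-end, many-to-one RCNN that predicts the traveled distance between the first and last image of a frame sequence using ordinal regression. On the KITTI dataset it outperforms the compared deep learning pose estimators and classical monocular vSLAM/VO methods in terms of distance prediction.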

## Key findings

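- Outperforms current state-of-the-art deep learning pose estimators on the KITTI dataset in terms of distance prediction
- Outperforms classical monocular vSLAM/VO methods on the same distance-prediction evaluation
- Can serve as a component for resolving scale ambiguity in classical monocular vSLAM/VO pipelines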
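
To make the scale-ambiguity point concrete, here is a minimal sketch of how a predicted traveled distance could rescale an up-to-scale monocular trajectory. It is not taken from the paper: the function names `path_length` and `rescale_trajectory` are hypothetical, and the sketch assumes the predicted distance is compared against the summed path length of the estimated camera positions. If the distance is instead defined as the straight-line displacement between first and last frame, the measure would need to change accordingly.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def path_length(positions: List[Point]) -> float:
    """Sum of Euclidean distances between consecutive camera positions."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def rescale_trajectory(positions: List[Point], predicted_distance: float) -> List[Point]:
    """Scale an up-to-scale trajectory about its first position so that its
    path length matches a metric distance estimate (assumed to come from a
    distance estimator such as DistanceNet)."""
    length = path_length(positions)
    if length == 0.0:
        raise ValueError("trajectory has zero length; scale is undefined")
    scale = predicted_distance / length
    origin = positions[0]
    return [
        tuple(o + scale * (c - o) for c, o in zip(p, origin))
        for p in positions
    ]


# Hypothetical up-to-scale positions from a monocular VO front end.
vo_positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
metric_positions = rescale_trajectory(vo_positions, predicted_distance=4.0)
```

The sketch applies one global scale factor per frame window, which is the simplest possible use of a distance estimate; a real pipeline would likely need to handle drift and per-window scale changes.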

## Abstract

Classical monocular vSLAM/VO methods suffer from the scale ambiguity problem. Hybrid approaches solve this problem by adding deep learning methods, for example by using depth maps which are predicted by a CNN. We suggest that it is better to base scale estimation on estimating the traveled distance for a set of subsequent images. In this paper, we propose a novel end-to-end many-to-one traveled distance estimator. By using a deep recurrent convolutional neural network (RCNN), the traveled distance between the first and last image of a set of consecutive frames is estimated by our DistanceNet. Geometric features are learned in the CNN part of our model, which are subsequently used by the RNN to learn dynamics and temporal information. Moreover, we exploit the natural order of distances by using ordinal regression to predict the distance. The evaluation on the KITTI dataset shows that our approach outperforms current state-of-the-art deep learning pose estimators and classical mono vSLAM/VO methods in terms of distance prediction. Thus, our DistanceNet can be used as a component to solve the scale problem and help improve current and future classical mono vSLAM/VO methods.
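
The abstract's use of ordinal regression can be illustrated with a small sketch. Ordinal regression commonly replaces a single regression target with a series of binary questions of the form "is the distance greater than threshold k?", which preserves the natural order of distances. The code below shows that generic encoding and a simple decoding rule. The uniform thresholds, the bin count, the maximum distance, and the midpoint decoding are all assumptions chosen for illustration; this summary does not state the discretization the authors use.

```python
from typing import List


def make_thresholds(max_distance: float, num_bins: int) -> List[float]:
    """Uniform thresholds splitting [0, max_distance] into num_bins bins.
    (Assumption: uniform spacing; other discretizations are possible.)"""
    step = max_distance / num_bins
    return [step * k for k in range(1, num_bins)]


def encode_ordinal(distance: float, thresholds: List[float]) -> List[float]:
    """Target vector: entry k is 1.0 if the distance exceeds threshold k."""
    return [1.0 if distance > t else 0.0 for t in thresholds]


def decode_ordinal(probs: List[float], max_distance: float) -> float:
    """Count how many thresholds are predicted as exceeded (prob > 0.5)
    and return the midpoint of the corresponding bin."""
    num_bins = len(probs) + 1
    step = max_distance / num_bins
    rank = sum(1 for p in probs if p > 0.5)
    return (rank + 0.5) * step


# Hypothetical setup: distances up to 10 m, split into 5 bins of 2 m.
thresholds = make_thresholds(max_distance=10.0, num_bins=5)
target = encode_ordinal(5.0, thresholds)
estimate = decode_ordinal(target, max_distance=10.0)
```

In a network, the final layer would output one probability per threshold and be trained against vectors like `target`. A prediction that misses by one bin then disagrees with the target in only one entry, whereas plain classification would treat every wrong bin as equally wrong.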

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.08105/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1904.08105/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1904.08105/full.md

---
Source: https://tomesphere.com/paper/1904.08105