# Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras

**Authors:** Ariel Gordon, Hanhan Li, Rico Jonschkowski, Anelia Angelova

arXiv: 1904.04998 · 2019-10-31

## TL;DR

This paper introduces an unsupervised method that learns depth, egomotion, object motion, and camera intrinsics from monocular videos, with no prior knowledge of the camera, and reports state-of-the-art results on Cityscapes, KITTI, and EuRoC.

## Contribution

To the authors' knowledge, it is the first work to learn camera intrinsics, including lens distortion, from unlabeled video, which allows depth and motion to be estimated from diverse real-world footage of unknown origin.

## Key findings

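- Established a new state of the art for depth prediction and odometry on the Cityscapes, KITTI, and EuRoC datasets.
- Showed qualitatively that depth prediction can be learned from a collection of YouTube videos without supervision.
- Introduced randomized layer normalization as a regularizer, along with geometric, differentiable occlusion handling that uses the predicted depth maps directly.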
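To make the occlusion-handling idea concrete, here is a minimal NumPy sketch of one plausible reading: a warped pixel contributes to the photometric loss only if its warped depth is no farther than the depth predicted for the frame it lands in. The function name `occlusion_aware_l1`, its arguments, and the exact inequality are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def occlusion_aware_l1(warped_rgb, target_rgb, warped_depth, target_depth):
    """Masked L1 photometric loss (hypothetical helper, for illustration only).

    warped_rgb, target_rgb: (H, W, 3) images.
    warped_depth: (H, W) depth of the source pixels after warping into the target frame.
    target_depth: (H, W) depth predicted for the target frame itself.
    """
    # Assumption: a warped pixel is treated as visible when it is not behind
    # the surface that the target frame's own depth map predicts there.
    visible = warped_depth <= target_depth
    # Per-pixel L1 difference, averaged over the color channels.
    per_pixel = np.abs(warped_rgb - target_rgb).mean(axis=-1)
    # Average over visible pixels only; guard against an empty mask.
    count = max(int(visible.sum()), 1)
    loss = float((per_pixel * visible).sum() / count)
    return loss, visible
```

Because the mask is derived from the depth predictions themselves, no separate occlusion network is needed in this sketch; a training framework version would use the same comparison on tensors.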

## Abstract

We present a novel method for simultaneous learning of depth, egomotion, object motion, and camera intrinsics from monocular videos, using only consistency across neighboring video frames as supervision signal. Similarly to prior work, our method learns by applying differentiable warping to frames and comparing the result to adjacent ones, but it provides several improvements: We address occlusions geometrically and differentiably, directly using the depth maps as predicted during training. We introduce randomized layer normalization, a novel powerful regularizer, and we account for object motion relative to the scene. To the best of our knowledge, our work is the first to learn the camera intrinsic parameters, including lens distortion, from video in an unsupervised manner, thereby allowing us to extract accurate depth and motion from arbitrary videos of unknown origin at scale. We evaluate our results on the Cityscapes, KITTI and EuRoC datasets, establishing new state of the art on depth prediction and odometry, and demonstrate qualitatively that depth prediction can be learned from a collection of YouTube videos.
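The warping described in the abstract rests on standard pinhole reprojection: a pixel is back-projected with its depth and the intrinsic matrix, moved by the camera motion, and projected again. The sketch below illustrates that geometry in NumPy under simplifying assumptions (no lens distortion, no object motion, the same intrinsics in both frames). The function name `reproject` and the example values are hypothetical and not taken from the paper.

```python
import numpy as np

def reproject(pixels, depth, K, R, t):
    """Map pixels from one frame into a neighboring frame (illustrative sketch).

    pixels: (N, 2) pixel coordinates (x, y) in the source frame.
    depth:  (N,) depth of each pixel in the source frame.
    K:      (3, 3) pinhole intrinsic matrix (assumed shared by both frames).
    R, t:   (3, 3) rotation and (3,) translation of the camera motion.
    Returns the (N, 2) pixel coordinates and (N,) depths in the neighboring frame.
    """
    ones = np.ones((pixels.shape[0], 1))
    homog = np.hstack([pixels, ones])        # (N, 3) homogeneous pixels
    rays = np.linalg.inv(K) @ homog.T        # (3, N) back-projected rays
    points = rays * depth                    # 3D points in the source camera frame
    moved = R @ points + t.reshape(3, 1)     # apply the camera motion
    proj = K @ moved                         # project with the same intrinsics
    new_depth = proj[2]
    new_pixels = (proj[:2] / new_depth).T
    return new_pixels, new_depth

# Example with made-up values: a pure sideways translation shifts nearer
# pixels more than farther ones (parallax).
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 40.0],
              [0.0, 0.0, 1.0]])
pixels = np.array([[50.0, 40.0], [60.0, 45.0]])
depth = np.array([2.0, 4.0])
shifted, _ = reproject(pixels, depth, K, np.eye(3), np.array([0.2, 0.0, 0.0]))
```

In a learning setup, `depth`, `R`, `t`, and (in this paper's setting) `K` would all be network predictions, and every step above is differentiable, which is what lets frame-to-frame consistency act as the supervision signal.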

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.04998/full.md

## Figures

51 figures with captions in the complete paper: https://tomesphere.com/paper/1904.04998/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/1904.04998/full.md

---
Source: https://tomesphere.com/paper/1904.04998