End-to-end depth from motion with stabilized monocular videos

Clément Pinard; Laure Chevalley; Antoine Manzanera; David Filliat

arXiv:1809.04453·cs.CV·September 13, 2018

TL;DR

This paper introduces a monocular video-based depth inference system using a new dataset that simulates stabilized aerial footage, demonstrating effective depth prediction in rigid scenes with a fully convolutional network.

Contribution

The paper presents a novel dataset and an end-to-end convolutional architecture for depth inference from stabilized monocular videos, simplifying the structure from motion problem.

Findings

01

Effective depth prediction in stabilized monocular videos

02

Locally solvable problem tied to camera parameters

03

Good quality depth maps achieved

Abstract

We propose a depth map inference system for monocular videos, based on a novel navigation dataset that mimics aerial footage from a gimbal-stabilized monocular camera in rigid scenes. Unlike in most navigation datasets, the lack of rotation implies an easier structure-from-motion problem, which can be leveraged for tasks such as depth inference and obstacle avoidance. We also propose an architecture for end-to-end depth inference with a fully convolutional network. Results show that, although tied to the camera's intrinsic parameters, the problem is locally solvable and leads to good-quality depth prediction.
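
To illustrate why the absence of rotation makes the problem easier and why it stays tied to camera parameters, here is a minimal geometric sketch. It is not the network proposed in the paper. It assumes an ideal pinhole camera with focal length f in pixels and the principal point at the image origin, a static point, and a known camera translation t = (tx, ty, tz) with no rotation. Under those assumptions, a point seen at p and then at p' satisfies p' - p = (p * tz - f * t_xy) / Z', where Z' is its depth after the move, so depth follows from one pixel's displacement alone. The function names and the sample values are hypothetical.

```python
import math


def project(point, f):
    """Pinhole projection of a 3D point (X, Y, Z); principal point at the origin."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


def depth_from_translation(p, p_next, t, f):
    """Depth (in the second frame) of a static point seen at p, then at p_next,
    for a camera that translated by t = (tx, ty, tz) without rotating."""
    tx, ty, tz = t
    # Magnitude of (p * tz - f * t_xy): depends on the focal length f.
    num = math.hypot(p[0] * tz - f * tx, p[1] * tz - f * ty)
    # Magnitude of the pixel displacement between the two frames.
    disp = math.hypot(p_next[0] - p[0], p_next[1] - p[1])
    if disp == 0.0:
        # No apparent motion: point at infinity or on the focus of expansion.
        return math.inf
    return num / disp


# Hypothetical example: one static point, one camera move.
f = 500.0                      # assumed focal length in pixels
point = (1.0, -0.5, 10.0)      # 3D point in the first camera frame
t = (0.2, 0.1, 1.0)            # camera translation between the two frames

p = project(point, f)
p_next = project((point[0] - t[0], point[1] - t[1], point[2] - t[2]), f)
depth = depth_from_translation(p, p_next, t, f)  # close to 9.0, the true depth after the move
```

The estimate uses only one pixel's position and displacement, which is the sense in which the problem is local. The explicit dependence on f reflects the tie to camera intrinsics noted in the abstract. With rotation present, the displacement would gain a depth-independent term that would have to be removed first.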
