MegaDepth: Learning Single-View Depth Prediction from Internet Photos

Zhengqi Li; Noah Snavely

arXiv:1804.00607 · cs.CV · November 29, 2018 · 65 cites

TL;DR

This paper introduces MegaDepth, a large-scale dataset for single-view depth prediction created from Internet photo collections using structure-from-motion and multi-view stereo, enabling models to generalize well across diverse scenes.

Contribution

The paper presents MegaDepth, a novel large-scale depth dataset from Internet photos, and demonstrates its effectiveness for training models with strong cross-dataset generalization.

Findings

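01 Models trained on MegaDepth generalize well to other datasets.

02 MegaDepth overcomes limitations of existing 3D-sensor datasets: indoor-only images (NYU), small numbers of training examples (Make3D), and sparse sampling (KITTI).

03 Data cleaning and augmentation with ordinal depth relations improve depth prediction accuracy.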

Abstract

Single-view depth prediction is a fundamental problem in computer vision. Recently, deep learning methods have led to significant progress, but such methods are limited by the available training data. Current datasets based on 3D sensors have key limitations, including indoor-only images (NYU), small numbers of training examples (Make3D), and sparse sampling (KITTI). We propose to use multi-view Internet photo collections, a virtually unlimited data source, to generate training data via modern structure-from-motion and multi-view stereo (MVS) methods, and present a large depth dataset called MegaDepth based on this idea. Data derived from MVS comes with its own challenges, including noise and unreconstructable objects. We address these challenges with new data cleaning methods, as well as automatically augmenting our data with ordinal depth relations generated using semantic…
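To make the idea of an ordinal depth relation concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' implementation: the function name `ordinal_relation`, the nested-list depth map, and the log-ratio tolerance `tol` are all assumptions introduced here. Given two pixels, it reports which one is farther from the camera, treating nearly equal depths as a tie.

```python
import math

def ordinal_relation(depth, p, q, tol=0.05):
    """Compare the depths at pixels p and q, each given as (row, col).

    Returns +1 if p is farther than q, -1 if p is closer, 0 if roughly equal.
    `tol` is a hypothetical threshold on the log depth ratio, not a value
    taken from the paper.
    """
    dp = depth[p[0]][p[1]]
    dq = depth[q[0]][q[1]]
    # Comparing log depths makes the test depend only on the ratio dp / dq.
    log_ratio = math.log(dp) - math.log(dq)
    if log_ratio > tol:
        return 1
    if log_ratio < -tol:
        return -1
    return 0

# Toy 2x2 depth map in arbitrary units (illustrative values only).
depth = [[1.0, 1.02],
         [4.0, 8.0]]

print(ordinal_relation(depth, (0, 0), (1, 1)))  # p is closer than q
print(ordinal_relation(depth, (0, 0), (0, 1)))  # depths within tolerance
print(ordinal_relation(depth, (1, 1), (1, 0)))  # p is farther than q
```

Because the comparison uses only the ratio of the two depths, the result does not change when the whole depth map is multiplied by a constant. That property matters here because structure-from-motion reconstructions recover depth only up to an unknown global scale.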

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Robotics and Sensor-Based Localization