MegaDepth: Learning Single-View Depth Prediction from Internet Photos
Zhengqi Li, Noah Snavely

TL;DR
This paper introduces MegaDepth, a large-scale dataset for single-view depth prediction created from Internet photo collections using structure-from-motion and multi-view stereo, enabling models to generalize well across diverse scenes.
Contribution
The paper presents MegaDepth, a novel large-scale depth dataset from Internet photos, and demonstrates its effectiveness for training models with strong cross-dataset generalization.
Findings
Models trained on MegaDepth generalize well to other datasets.
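The MegaDepth dataset addresses key limitations of existing sensor-based datasets: indoor-only images (NYU), small numbers of training examples (Make3D), and sparse sampling (KITTI).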
Data cleaning and augmentation improve depth prediction accuracy.
Abstract
Single-view depth prediction is a fundamental problem in computer vision. Recently, deep learning methods have led to significant progress, but such methods are limited by the available training data. Current datasets based on 3D sensors have key limitations, including indoor-only images (NYU), small numbers of training examples (Make3D), and sparse sampling (KITTI). We propose to use multi-view Internet photo collections, a virtually unlimited data source, to generate training data via modern structure-from-motion and multi-view stereo (MVS) methods, and present a large depth dataset called MegaDepth based on this idea. Data derived from MVS comes with its own challenges, including noise and unreconstructable objects. We address these challenges with new data cleaning methods, as well as automatically augmenting our data with ordinal depth relations generated using semantic…
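The abstract mentions augmenting the data with ordinal depth relations, meaning labels that say only which of two pixels is farther away. As an illustration of how such labels can supervise a depth network, the sketch below implements a generic pairwise ranking loss on predicted log-depths. It is an assumption made for illustration, not the authors' loss: the function name `ordinal_depth_loss`, its arguments, and the softplus form are all hypothetical.

```python
import numpy as np

def ordinal_depth_loss(log_depth, pairs, relations):
    """Generic pairwise ranking loss (hypothetical, not the paper's formulation).

    log_depth : 2-D array of predicted log-depths.
    pairs     : sequence of ((row_i, col_i), (row_j, col_j)) pixel pairs.
    relations : +1 if pixel i should be farther than pixel j, -1 if closer.
    """
    total = 0.0
    for (p_i, p_j), r in zip(pairs, relations):
        diff = log_depth[p_i] - log_depth[p_j]
        # softplus(-r * diff): small when the predicted order agrees with r,
        # large when the prediction contradicts the ordinal label.
        total += np.logaddexp(0.0, -r * diff)
    return total / len(relations)

# Toy prediction: the right pixel (depth 4) is farther than the left (depth 1).
pred = np.log(np.array([[1.0, 4.0]]))
pair = [((0, 1), (0, 0))]
loss_consistent = ordinal_depth_loss(pred, pair, [+1])    # label agrees
loss_contradicted = ordinal_depth_loss(pred, pair, [-1])  # label disagrees
```

Because the loss depends only on differences of log-depths, it needs no metric ground truth for the labeled pixels, so it can be applied in regions where multi-view stereo yields no reconstruction.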
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Robotics and Sensor-Based Localization
