Monocular Depth Estimation with Augmented Ordinal Depth Relationships
Yuanzhouhan Cao, Tianqi Zhao, Ke Xian, Chunhua Shen, Zhiguo Cao, Shugong Xu

TL;DR
This paper introduces a novel approach to monocular depth estimation that leverages relative depth cues from stereo videos, using a new dataset and a classification-based training scheme to improve accuracy and confidence estimation.
Contribution
It presents a new dataset with relative depths from stereo videos and a classification-based training method for monocular depth estimation that achieves state-of-the-art results.
Findings
Achieved state-of-the-art performance on indoor and outdoor RGB-D benchmarks.
Demonstrated the effectiveness of relative depth cues from stereo videos.
Proposed an information gain loss to improve depth prediction confidence.
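This summary does not define the information gain loss, so the following is only a minimal sketch under an assumed form: a cross-entropy over depth bins in which bins close to the ground-truth bin also receive weight, decaying with squared bin distance. The function name `info_gain_loss`, the Gaussian-style weighting, and the parameter `alpha` are illustrative assumptions, not the paper's stated formulation.

```python
import numpy as np

def info_gain_loss(logits, target, alpha=0.2):
    """Distance-weighted cross-entropy over depth bins (assumed form).

    logits: (N, B) unnormalised scores for B depth bins.
    target: (N,) integer ground-truth bin indices.
    alpha:  hypothetical decay rate for the weight given to neighbouring bins.
    """
    logits = np.asarray(logits, dtype=np.float64)
    target = np.asarray(target)
    num_bins = logits.shape[1]

    # Numerically stable log-softmax over the bin axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Weight 1 at the true bin, decaying with squared distance from it.
    bins = np.arange(num_bins)
    weights = np.exp(-alpha * (bins[None, :] - target[:, None]) ** 2)

    # Weighted negative log-likelihood, averaged over samples.
    return float(-(weights * log_prob).sum(axis=1).mean())

# Usage: a prediction peaked on the true bin costs less than one peaked far away.
near = info_gain_loss([[0, 0, 5, 0, 0]], [2])
far = info_gain_loss([[5, 0, 0, 0, 0]], [2])
```

Under this assumed weighting, a prediction one bin off is penalised less than one several bins off, which is one way a loss could encourage smoother, better-calibrated class probabilities.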
Abstract
Most existing algorithms for depth estimation from single monocular images need large quantities of metric ground-truth depths for supervised learning. We show that relative depth can be an informative cue for metric depth estimation and can be easily obtained from vast stereo videos. Acquiring metric depths from stereo videos is sometimes impracticable due to the absence of camera parameters. In this paper, we propose to improve the performance of metric depth estimation with relative depths collected from stereo movie videos using an existing stereo matching algorithm. We introduce a new "Relative Depth in Stereo" (RDIS) dataset densely labelled with relative depths. We first pretrain a ResNet model on our RDIS dataset. Then we finetune the model on RGB-D datasets with metric ground-truth depths. During our finetuning, we formulate depth estimation as a classification task. This…
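The two ideas in the abstract, relative depth labels derived from stereo matching and depth estimation posed as classification, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the log-space binning, the depth range (`d_min`, `d_max`), the bin count, the disparity tolerance `tol`, and all function names are hypothetical choices.

```python
import numpy as np

def ordinal_relation(disp_a, disp_b, tol=1.0):
    """Relative depth of point a vs. point b from stereo disparities.

    Larger disparity means closer to the camera, so no camera parameters
    are needed. Returns +1 (a closer), -1 (a farther), 0 (about equal).
    `tol` is an assumed tolerance in pixels.
    """
    diff = disp_a - disp_b
    if diff > tol:
        return 1
    if diff < -tol:
        return -1
    return 0

def depth_to_bins(depth, d_min=0.5, d_max=10.0, num_bins=50):
    """Map metric depths to class labels using bins uniform in log-depth."""
    depth = np.clip(np.asarray(depth, dtype=np.float64), d_min, d_max)
    log_min, log_max = np.log(d_min), np.log(d_max)
    frac = (np.log(depth) - log_min) / (log_max - log_min)
    labels = np.floor(frac * num_bins).astype(np.int64)
    # d_max itself lands on the upper edge; fold it into the last bin.
    return np.minimum(labels, num_bins - 1)

def bins_to_depth(labels, d_min=0.5, d_max=10.0, num_bins=50):
    """Recover a metric depth from a class label (bin centre in log space)."""
    log_min, log_max = np.log(d_min), np.log(d_max)
    step = (log_max - log_min) / num_bins
    return np.exp(log_min + (np.asarray(labels) + 0.5) * step)

# Usage: label a pair of pixels, then discretise and recover a depth value.
relation = ordinal_relation(12.0, 5.0)
labels = depth_to_bins([0.5, 2.0, 10.0])
recovered = bins_to_depth(labels)
```

Log-spaced bins are one common design choice because depth errors tend to matter proportionally: a bin then covers a roughly constant relative range at any distance.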
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques
Methods: Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling · Residual Connection
