Rethinking Monocular Depth Estimation with Adversarial Training

Richard Chen; Faisal Mahmood; Alan Yuille; Nicholas J. Durr

arXiv:1808.07528 · cs.CV · June 18, 2019 · 39 cites

TL;DR

This paper introduces a novel adversarial training framework for monocular depth estimation that leverages a non-local, patch-level loss to incorporate global context, achieving state-of-the-art results.

Contribution

It proposes a context-aware, adversarial loss function for depth estimation, stabilized with spectral normalization, and demonstrates improved performance over existing methods.
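As a rough illustration of the stabilization step named above, spectral normalization rescales a weight matrix by an estimate of its largest singular value, usually obtained by power iteration. The sketch below uses only the Python standard library; the function names, iteration count, and seed are illustrative assumptions and are not taken from the paper's implementation.

```python
import math
import random


def matvec(W, v):
    """Return W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def mat_t_vec(W, u):
    """Return W.T @ u for a matrix stored as a list of rows."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def l2_normalize(x, eps=1e-12):
    """Scale x to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in x))
    return [a / (norm + eps) for a in x]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility only
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(len(W))])
    for _ in range(n_iters):
        v = l2_normalize(mat_t_vec(W, u))  # right singular vector estimate
        u = l2_normalize(matvec(W, v))     # left singular vector estimate
    # sigma is approximately u^T W v
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Return (W / sigma, sigma) so the rescaled matrix has spectral norm near 1."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W], sigma
```

For example, `spectrally_normalize([[3.0, 0.0], [0.0, 1.0]])` rescales the matrix by an estimated sigma close to 3. In practice, deep learning frameworks typically apply this per layer and carry the power-iteration vector across training steps rather than recomputing it from scratch.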

Findings

01

Reduces relative error significantly on NYUv2, Make3D, and KITTI datasets.

02

Achieves state-of-the-art performance in monocular depth estimation.

03

Incorporates global context through patch-level adversarial loss.

Abstract

Monocular depth estimation is an extensively studied computer vision problem with a vast variety of applications. Deep learning-based methods have demonstrated promise for both supervised and unsupervised depth estimation from monocular images. Most existing approaches treat depth estimation as a regression problem with a local pixel-wise loss function. In this work, we innovate beyond existing approaches by using adversarial training to learn a context-aware, non-local loss function. Such an approach penalizes the joint configuration of predicted depth values at the patch-level instead of the pixel-level, which allows networks to incorporate more global information. In this framework, the generator learns a mapping between an RGB image and its corresponding depth map, while the discriminator learns to distinguish depth map and RGB pairs from ground truth. This conditional GAN depth…
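To make the patch-level idea in the abstract concrete, the sketch below assumes the discriminator emits a grid of logits, one per image patch, for an (RGB, depth) pair, and averages a binary cross-entropy term over that grid. The cross-entropy form, the 0.5 weighting, and the non-saturating generator term are common conditional GAN conventions assumed here for illustration; the excerpt above does not state the exact objective, and all function names are hypothetical.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy for one logit and a 0/1 target."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def patch_loss(patch_logits, target):
    """Average the per-patch loss over a 2-D grid of discriminator logits."""
    flat = [logit for row in patch_logits for logit in row]
    return sum(bce_with_logits(logit, target) for logit in flat) / len(flat)


def discriminator_loss(real_logits, fake_logits):
    """Ground-truth pairs should score 1 on every patch, generated pairs 0."""
    return 0.5 * (patch_loss(real_logits, 1.0) + patch_loss(fake_logits, 0.0))


def generator_adv_loss(fake_logits):
    """Assumed non-saturating form: the generator wants every patch judged real."""
    return patch_loss(fake_logits, 1.0)
```

Because each logit covers a whole patch rather than a single pixel, the penalty depends on the joint configuration of depth values inside that patch, which is the contrast with a pixel-wise regression loss drawn in the abstract.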

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques

Methods: Concatenated Skip Connection · Max Pooling · U-Net · Convolution · Spectral Normalization