Rethinking Monocular Depth Estimation with Adversarial Training
Richard Chen, Faisal Mahmood, Alan Yuille, Nicholas J. Durr

TL;DR
This paper introduces a novel adversarial training framework for monocular depth estimation that leverages a non-local, patch-level loss to incorporate global context, achieving state-of-the-art results.
Contribution
It proposes a context-aware, adversarial loss function for depth estimation, stabilized with spectral normalization, and demonstrates improved performance over existing methods.
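As a rough illustration of the stabilization step mentioned above, spectral normalization rescales a weight matrix by its largest singular value, which is usually estimated by power iteration. The following is a minimal sketch in plain Python, not the authors' implementation: the helper names (`spectral_norm`, `spectrally_normalize`), the iteration count, and the random seed are all assumptions made for illustration, and a real network would apply this per layer inside a deep learning framework.

```python
import math
import random

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def l2_normalize(v, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    Wt = transpose(W)
    # u lives in the output (row) space, v in the input (column) space.
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = l2_normalize(matvec(Wt, u))
        u = l2_normalize(matvec(W, v))
    # sigma is approximated by u^T W v.
    return sum(a * b for a, b in zip(u, matvec(W, v)))

def spectrally_normalize(W, n_iters=50):
    """Return W / sigma(W), so the largest singular value is about 1."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]

if __name__ == "__main__":
    W = [[3.0, 0.0], [0.0, 1.0]]  # hypothetical toy weight matrix
    print(spectral_norm(W))
    print(spectrally_normalize(W))
```

Bounding the largest singular value of each discriminator layer limits how sharply the discriminator's output can change with its input, which is the usual motivation for using this technique in GAN training.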
Findings
Reduces relative error significantly on NYUv2, Make3D, and KITTI datasets.
Achieves state-of-the-art performance in monocular depth estimation.
Incorporates global context through patch-level adversarial loss.
Abstract
Monocular depth estimation is an extensively studied computer vision problem with a wide variety of applications. Deep learning-based methods have demonstrated promise for both supervised and unsupervised depth estimation from monocular images. Most existing approaches treat depth estimation as a regression problem with a local pixel-wise loss function. In this work, we move beyond existing approaches by using adversarial training to learn a context-aware, non-local loss function. Such an approach penalizes the joint configuration of predicted depth values at the patch level instead of the pixel level, which allows networks to incorporate more global information. In this framework, the generator learns a mapping from RGB images to their corresponding depth maps, while the discriminator learns to distinguish predicted depth map and RGB pairs from ground-truth pairs. This conditional GAN depth…
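To make the patch-level idea in the abstract concrete, the sketch below shows one common way such a loss is computed: the discriminator emits a grid of scores, one per image patch, and the loss averages a binary cross-entropy over that grid. This is an assumption-laden illustration, not the paper's exact objective. The function names, the use of probabilities in (0, 1), and the non-saturating generator loss are hypothetical choices, and the score grids stand in for the output of a discriminator that sees the RGB image paired with a depth map.

```python
import math

def bce(p, target, eps=1e-7):
    # Binary cross-entropy for one probability, clamped for numerical safety.
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def patch_discriminator_loss(real_scores, fake_scores):
    """Average BCE over patch grids.

    real_scores: per-patch probabilities for (RGB, ground-truth depth) pairs.
    fake_scores: per-patch probabilities for (RGB, predicted depth) pairs.
    """
    real = [bce(p, 1.0) for row in real_scores for p in row]
    fake = [bce(p, 0.0) for row in fake_scores for p in row]
    return sum(real) / len(real) + sum(fake) / len(fake)

def patch_generator_loss(fake_scores):
    # Non-saturating form (an assumption): the generator wants every
    # patch of its predicted depth map to be scored as real.
    fake = [bce(p, 1.0) for row in fake_scores for p in row]
    return sum(fake) / len(fake)

if __name__ == "__main__":
    # Hypothetical 2x2 patch grids from a discriminator.
    real = [[0.9, 0.8], [0.95, 0.85]]
    fake = [[0.2, 0.1], [0.3, 0.6]]  # one patch (0.6) fools the discriminator
    print(patch_discriminator_loss(real, fake))
    print(patch_generator_loss(fake))
```

Because each score depends on a whole patch of depth values, a prediction that is plausible pixel by pixel but inconsistent as a region can still be penalized, which is the sense in which the loss is non-local.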
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques
Methods: Concatenated Skip Connection · Max Pooling · U-Net · Convolution · Spectral Normalization
