Aperture Supervision for Monocular Depth Estimation

Pratul P. Srinivasan; Rahul Garg; Neal Wadhwa; Ren Ng; Jonathan T. Barron

arXiv:1711.07933·cs.CV·March 30, 2018

TL;DR

This paper introduces a novel approach for monocular depth estimation that leverages camera aperture effects as supervision, using differentiable aperture rendering to train models without requiring depth sensors or multiple viewpoints.

Contribution

The method supervises depth estimation with aperture effects. It introduces two differentiable aperture rendering functions, so a single-image depth network can be trained end-to-end on images of the same viewpoint captured with different apertures.

Findings

01. Achieves accurate depth estimation using aperture effects as supervision.

02. Eliminates the need for depth sensors or multi-view images as training supervision.

03. Demonstrates effectiveness on benchmark datasets.

Abstract

We present a novel method to train machine learning algorithms to estimate scene depths from a single image, by using the information provided by a camera's aperture as supervision. Prior works use a depth sensor's outputs or images of the same scene from alternate viewpoints as supervision, while our method instead uses images from the same viewpoint taken with a varying camera aperture. To enable learning algorithms to use aperture effects as supervision, we introduce two differentiable aperture rendering functions that use the input image and predicted depths to simulate the depth-of-field effects caused by real camera apertures. We train a monocular depth estimation network end-to-end to predict the scene depths that best explain these finite aperture images as defocus-blurred renderings of the input all-in-focus image.
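
To make the training signal described in the abstract concrete, here is a minimal sketch of the idea: blur an all-in-focus image according to a candidate depth map, then compare the result with a real shallow depth-of-field image. It is an illustrative stand-in, not either of the paper's two rendering functions. Every name (`disk_kernel`, `blur`, `render_shallow_dof`, `aperture_loss`), the disk-shaped blur, the soft assignment of pixels to disparity planes, and the L1 loss are assumptions made for this example. NumPy is used for readability, so no gradients are computed here. An actual training setup would express the same operations in an autodiff framework.

```python
import numpy as np

def disk_kernel(radius):
    """Normalized disk-shaped blur kernel; radius 0 yields a 1x1 identity kernel."""
    r = int(np.ceil(radius))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    k = (x**2 + y**2 <= radius**2 + 1e-9).astype(np.float64)
    return k / k.sum()

def blur(image, kernel):
    """Filter an HxW array with a small symmetric kernel, using edge padding."""
    r = kernel.shape[0] // 2
    padded = np.pad(image, r, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def render_shallow_dof(image, disparity, focus_disparity, aperture, planes):
    """Blend per-plane blurred copies of `image`, weighted by a soft plane assignment.

    Assumes `planes` is evenly spaced and that `disparity` lies within its range.
    """
    planes = np.asarray(planes, dtype=np.float64)
    spacing = planes[1] - planes[0]
    rendered = np.zeros(image.shape, dtype=np.float64)
    weight_sum = np.zeros(image.shape, dtype=np.float64)
    for d in planes:
        # Piecewise-linear weight: 1 when the pixel sits on this plane, 0 one plane away.
        w = np.clip(1.0 - np.abs(disparity - d) / spacing, 0.0, 1.0)
        # Blur radius grows with distance from the in-focus disparity (assumed model).
        k = disk_kernel(aperture * abs(d - focus_disparity))
        rendered += blur(image * w, k)
        weight_sum += blur(w, k)
    return rendered / np.maximum(weight_sum, 1e-8)

def aperture_loss(image, disparity, target, focus_disparity, aperture, planes):
    """Mean absolute error between the rendering and the observed shallow-DoF image."""
    rendered = render_shallow_dof(image, disparity, focus_disparity, aperture, planes)
    return np.mean(np.abs(rendered - target))

# Toy scene: the left half is in focus (disparity 0), the right half is not (disparity 1).
rng = np.random.default_rng(0)
image = rng.random((16, 16))
true_disp = np.zeros((16, 16))
true_disp[:, 8:] = 1.0
planes = [0.0, 0.5, 1.0]

target = render_shallow_dof(image, true_disp, 0.0, 2.0, planes)
loss_true = aperture_loss(image, true_disp, target, 0.0, 2.0, planes)
loss_flat = aperture_loss(image, np.zeros((16, 16)), target, 0.0, 2.0, planes)
print(loss_true, loss_flat)
```

In this toy setup the disparity map that generated the target reproduces it exactly, while a flat disparity map leaves the image unblurred and incurs a positive loss. That gap is the kind of signal a depth network could be trained to minimize.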
