Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs

Xinqing Guo; Zhang Chen; Siyuan Li; Yang Yang; Jingyi Yu

arXiv:1711.10729 · cs.CV · August 11, 2020 · 2 cites

TL;DR

This paper introduces a unified deep learning framework that combines binocular stereo and monocular focus cues from focal stacks to improve 3D depth perception, outperforming existing methods in accuracy and speed.

Contribution

It presents a novel integrated neural network architecture that simultaneously leverages focus and stereo cues for depth inference from focal stack pairs.

Findings

01. Outperforms the state of the art in accuracy and speed

02. Effectively emulates human visual perception

03. Provides high-quality depth maps from focal stacks

Abstract

The human visual system relies on both binocular stereo cues and monocular focusness cues to gain effective 3D perception. In computer vision, the two problems are traditionally solved in separate tracks. In this paper, we present a unified learning-based technique that simultaneously uses both types of cues for depth inference. Specifically, we use a pair of focal stacks as input to emulate human perception. We first construct a comprehensive focal stack training dataset synthesized by depth-guided light field rendering. We then construct three individual networks: a Focus-Net to extract depth from a single focal stack, an EDoF-Net to obtain the extended depth of field (EDoF) image from the focal stack, and a Stereo-Net to conduct stereo matching. We show how to integrate them into a unified BDfF-Net to obtain high-quality depth maps. Comprehensive experiments show that our approach…
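
To make the monocular focus cue concrete, here is a minimal sketch of classical, non-learned depth-from-focus on a single focal stack. It is not the paper's Focus-Net or EDoF-Net, which are learned networks. It only illustrates the two outputs those networks target: a per-pixel index of the sharpest slice (a proxy for depth) and an all-in-focus composite (a proxy for the EDoF image). The function names, the Laplacian focus measure, and the `(S, H, W)` grayscale layout are assumptions made for illustration.

```python
import numpy as np


def laplacian_focus_measure(img):
    """Absolute response of a 4-neighbour discrete Laplacian (edge-padded).

    Assumption: a simple sharpness score; the paper learns its focus cue instead.
    """
    p = np.pad(img, 1, mode="edge")
    lap = (
        p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
        - 4.0 * p[1:-1, 1:-1]
    )
    return np.abs(lap)


def depth_from_focus(stack):
    """Hypothetical helper: stack is an (S, H, W) grayscale focal stack.

    Returns (index_map, edof): the index of the sharpest slice at each pixel,
    and the image assembled from those sharpest pixels.
    """
    stack = np.asarray(stack, dtype=np.float64)
    sharpness = np.stack([laplacian_focus_measure(s) for s in stack])
    index_map = np.argmax(sharpness, axis=0)  # sharpest slice per pixel
    edof = np.take_along_axis(stack, index_map[None], axis=0)[0]
    return index_map, edof
```

In a binocular setting, one would run a focus analysis of this kind on each stack of the pair and then match the two all-in-focus images for stereo disparity. How the paper fuses these cues inside BDfF-Net is not specified in the abstract above, so that step is left out of the sketch.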

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Optical measurement and interference techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings