Monocular Depth Estimation Using Cues Inspired by Biological Vision Systems
Dylan Auty, Krystian Mikolajczyk

TL;DR
This paper introduces a biologically inspired approach to monocular depth estimation that leverages semantic cues, object size priors, and relational information to improve depth prediction accuracy, especially with limited training data.
Contribution
The work explicitly incorporates external semantic, size, and relational cues into depth estimation models, inspired by biological vision systems, to enhance performance and data efficiency.
Findings
Improved depth prediction accuracy on the NYUD2 benchmark.
External semantic and size cues enhance model performance.
Method is adaptable to various depth estimation systems.
Abstract
Monocular depth estimation (MDE) aims to transform an RGB image of a scene into a pixelwise depth map from the same camera view. It is fundamentally ill-posed due to missing information: any single image could have been taken of many possible 3D scenes. Part of the MDE task is, therefore, to learn which visual cues in the image can be used for depth estimation, and how. This is challenging when training data is limited by the cost of annotation, or network capacity by computational power. In this work we demonstrate that explicitly injecting visual cue information into the model is beneficial for depth estimation. Following research into biological vision systems, we focus on semantic information and prior knowledge of object sizes and their relations, to emulate the biological cues of relative size, familiar size, and absolute size. We use state-of-the-art semantic and instance…
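To make the ill-posedness and the familiar-size cue concrete, here is a minimal sketch under a simple pinhole camera model. It is not the authors' method: the function names, the focal length, and the object sizes are all hypothetical values chosen for illustration. It shows that a small near object and a large far object can project to the same image height, and that knowing the object's real size resolves the ambiguity.

```python
# Illustrative sketch only: a pinhole-camera toy example, not the paper's model.
# All names and numbers below are hypothetical.

def project_height_px(real_height_m, depth_m, focal_px):
    """Image height (pixels) of an object of a given real height at a given depth."""
    return focal_px * real_height_m / depth_m


def depth_from_familiar_size(real_height_m, image_height_px, focal_px):
    """Invert the projection: recover depth when the real height is known."""
    return focal_px * real_height_m / image_height_px


FOCAL_PX = 500.0  # assumed focal length in pixels

# Scale ambiguity: two different scenes produce the same image measurement.
small_near = project_height_px(real_height_m=0.5, depth_m=2.0, focal_px=FOCAL_PX)
large_far = project_height_px(real_height_m=1.0, depth_m=4.0, focal_px=FOCAL_PX)
print(small_near, large_far)  # both objects appear the same height in the image

# Familiar size: if a semantic label tells us the object is about 1.0 m tall,
# the depth is no longer ambiguous.
recovered_depth = depth_from_familiar_size(
    real_height_m=1.0, image_height_px=large_far, focal_px=FOCAL_PX
)
print(recovered_depth)
```

In this toy setting the size prior acts as the missing constraint; a learned model would have to combine such priors with other cues, since real object sizes vary within a class.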
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Cell Image Analysis Techniques
