DepthCues: Evaluating Monocular Depth Perception in Large Vision Models
Duolikun Danier, Mehmet Aygün, Changjian Li, Hakan Bilen, Oisin Mac Aodha

TL;DR
This paper investigates how monocular depth cues emerge in large pre-trained vision models without explicit depth supervision, introduces a new benchmark for evaluation, and demonstrates that fine-tuning can enhance depth perception.
Contribution
The paper introduces the DepthCues benchmark to evaluate depth perception in vision models and analyzes the emergence of human-like depth cues in large models without explicit supervision.
Findings
Human-like depth cues emerge in more recent, larger models.
Fine-tuning on DepthCues improves depth estimation.
The benchmark and code are publicly available.
Abstract
Large-scale pre-trained vision models are becoming increasingly prevalent, offering expressive and generalizable visual representations that benefit various downstream tasks. Recent studies on the emergent properties of these models have revealed their high-level geometric understanding, in particular in the context of depth perception. However, it remains unclear how depth perception arises in these models without explicit depth supervision provided during pre-training. To investigate this, we examine whether the monocular depth cues, similar to those used by the human visual system, emerge in these models. We introduce a new benchmark, DepthCues, designed to evaluate depth cue understanding, and present findings across 20 diverse and representative pre-trained vision models. Our analysis shows that human-like depth cues emerge in more recent larger models. We also explore enhancing…
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Advanced Vision and Imaging · Color Science and Applications
