Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators
Yuki Kubota, Taiki Fukiage

TL;DR
This paper investigates the relationship between model accuracy and human-likeness in monocular depth estimation. It finds that higher accuracy does not always mean more human-like perception, which underscores the need for human-centric evaluation methods.
Contribution
The study systematically analyzes the correlation between accuracy and human similarity across 69 depth estimators, highlighting the divergence and trade-offs between these aspects.
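As a rough illustration of what correlating accuracy with human similarity across a set of models can look like, here is a minimal sketch that computes a Spearman rank correlation between two per-model scores. The choice of Spearman correlation, the helper names `rankdata` and `spearman`, and the five placeholder scores are all assumptions made for illustration; they are not the paper's metrics, data, or results.

```python
import numpy as np


def rankdata(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks of tied values with their mean rank.
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = rankdata(x), rankdata(y)
    return float(np.corrcoef(rx, ry)[0, 1])


# Hypothetical placeholder scores for five imaginary models (NOT from the paper).
accuracy = [0.70, 0.80, 0.85, 0.90, 0.95]
human_similarity = [0.40, 0.55, 0.60, 0.58, 0.50]

rho = spearman(accuracy, human_similarity)
print(f"Spearman rho between accuracy and human similarity: {rho:.2f}")
```

A rank-based measure is used here only because it does not assume a linear relationship between the two scores; a weak or negative coefficient on real data would be one way a divergence between accuracy and human-likeness could show up.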
Findings
Shared estimation biases between humans and DNNs
Trade-off between accuracy and human-likeness in models
Accuracy improvements do not guarantee human-like behavior
Abstract
Monocular depth estimation is a fundamental capability for real-world applications such as autonomous driving and robotics. Although deep neural networks (DNNs) have achieved superhuman accuracy on physically based benchmarks, a key challenge remains: aligning model representations with human perception, a promising strategy for enhancing model robustness and interpretability. Research in object recognition has revealed a complex trade-off between model accuracy and human-like behavior, raising the question of whether a similar divergence exists in depth estimation, particularly for natural outdoor scenes where benchmarks rely on sensor-based ground truth rather than human perceptual estimates. In this study, we systematically investigated the relationship between model accuracy and human similarity across 69 monocular depth estimators using the KITTI dataset. To dissect the structure of error…
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Advanced Vision and Imaging
