Learning Image-Adaptive Scale Fields for Metric Depth Recovery
Yuanyan Li, Matthias Althoff

TL;DR
This paper introduces a method for recovering metric depth from monocular depth estimation by modeling image-adaptive scale fields, yielding more accurate and robust depth estimates, especially when only sparse metric anchors are available.
Contribution
The paper formulates depth correction as a low-dimensional linear combination of image-adaptive basis maps derived from semantic and geometric cues, with the combination weights fitted to sparse metric anchors by least squares.
Findings
Improved metric depth accuracy across multiple datasets.
Robustness under extreme anchor sparsity.
Interpretable decomposition of spatial scale variations.
Abstract
Monocular depth estimation (MDE) typically produces depth estimations that are defined up to an unknown scale or shift. When only sparse metric anchors are available, recovering accurate metric depth becomes challenging yet necessary for practical applications. We address this problem by formulating metric depth recovery as image-adaptive scale field modeling. Instead of directly correcting the depth, we reformulate the correction as a low-dimensional linear combination of image-adaptive basis maps. These maps are derived from semantic and geometric cues encoded in the MDE estimations and intermediate representations. The weights of basis maps are efficiently determined from sparse metric anchors via a least-squares problem. This formulation yields improved metric depth accuracy, strong robustness under extreme anchor sparsity, and an interpretable decomposition of spatial scale…
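To make the least-squares step concrete, here is a minimal NumPy sketch of fitting basis-map weights from sparse anchors. It is not the authors' implementation. It assumes the scale field multiplies the relative depth (the abstract does not state how the correction is applied), and it uses three hand-made basis maps (a constant and two coordinate ramps) as stand-ins for the learned, image-adaptive maps. The function name `fit_scale_field` and all variable names are hypothetical.

```python
import numpy as np

def fit_scale_field(rel_depth, basis_maps, anchor_rc, anchor_depth):
    """Fit basis-map weights from sparse metric anchors (illustrative sketch).

    rel_depth:    (H, W) relative depth from an MDE model
    basis_maps:   (K, H, W) basis maps (assumed given; learned in the paper)
    anchor_rc:    (N, 2) integer (row, col) anchor locations
    anchor_depth: (N,) metric depth at the anchors
    Assumption: metric_depth = scale_field * rel_depth.
    """
    rows, cols = anchor_rc[:, 0], anchor_rc[:, 1]
    # Design matrix A[i, k] = rel_depth(x_i) * B_k(x_i), shape (N, K)
    A = (basis_maps[:, rows, cols] * rel_depth[rows, cols]).T
    # Least-squares weights; lstsq returns the minimum-norm solution if N < K
    weights, *_ = np.linalg.lstsq(A, anchor_depth, rcond=None)
    # Scale field is the weighted sum of basis maps, shape (H, W)
    scale_field = np.tensordot(weights, basis_maps, axes=1)
    return weights, scale_field, scale_field * rel_depth

# Synthetic check: build a known scale field and recover it from 20 anchors.
rng = np.random.default_rng(0)
H, W = 32, 48
ys, xs = np.meshgrid(np.linspace(0, 1, H), np.linspace(0, 1, W), indexing="ij")
basis = np.stack([np.ones((H, W)), xs, ys])      # hypothetical basis maps
true_w = np.array([2.0, 0.5, -0.3])
rel = rng.uniform(1.0, 5.0, size=(H, W))         # fake relative depth
metric_gt = np.tensordot(true_w, basis, axes=1) * rel

flat = rng.choice(H * W, size=20, replace=False)  # sparse anchor pixels
anchors = np.stack(np.divmod(flat, W), axis=1)    # (N, 2) as (row, col)
w, scale, metric = fit_scale_field(
    rel, basis, anchors, metric_gt[anchors[:, 0], anchors[:, 1]]
)
```

Because only K weights are estimated rather than a per-pixel correction, a handful of anchors can constrain the whole field. That is one plausible reading of why such a formulation would tolerate extreme anchor sparsity.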
