OASIS-DC: Generalizable Depth Completion via Output-level Alignment of Sparse-Integrated Monocular Pseudo Depth
Jaehyeon Cho, Jhonghyun An

TL;DR
This paper introduces a method to convert relative depth estimates from monocular foundation models into metric depth using sparse range measurements, enabling accurate depth completion with minimal labeled data for robotics applications.
Contribution
The paper proposes a calibration approach that combines foundation-model priors with sparse range anchors, improving depth completion accuracy when few labeled samples are available.
Findings
Effective depth completion with few labeled samples
Stable scale and sharp edges in predictions
Robust performance without curated validation data
Abstract
Recent monocular foundation models excel at zero-shot depth estimation, yet their outputs are inherently relative rather than metric, limiting direct use in robotics and autonomous driving. We leverage the fact that relative depth preserves global layout and boundaries: by calibrating it with sparse range measurements, we transform it into a pseudo metric depth prior. Building on this prior, we design a refinement network that follows the prior where reliable and deviates where necessary, enabling accurate metric predictions from very few labeled samples. The resulting system is particularly effective when curated validation data are unavailable, sustaining stable scale and sharp edges across few-shot regimes. These findings suggest that coupling foundation priors with sparse anchors is a practical route to robust, deployment-ready depth completion under real-world label scarcity.
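The calibration step described in the abstract can be illustrated with a small sketch. The abstract does not specify how the relative depth is aligned to the sparse range measurements, so the global scale-and-shift least-squares fit below is an assumption chosen for illustration, not the authors' method. The function name `calibrate_relative_depth` and the convention that a value of 0 marks a pixel without a measurement are also hypothetical.

```python
import numpy as np


def calibrate_relative_depth(rel_depth, sparse_depth):
    """Fit scale s and shift t so that s * rel_depth + t matches the sparse anchors.

    Assumption: sparse_depth holds metric range values at measured pixels
    and 0 everywhere else.
    """
    mask = sparse_depth > 0
    if mask.sum() < 2:
        raise ValueError("need at least two sparse anchors to fit scale and shift")

    x = rel_depth[mask].astype(np.float64)
    y = sparse_depth[mask].astype(np.float64)

    # Solve min over (s, t) of || s * x + t - y ||^2 in closed form.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)

    # Apply the fitted affine map to every pixel to get a pseudo metric prior.
    return s * rel_depth + t, s, t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rel = rng.random((32, 32))              # stand-in for a relative depth map
    metric = 10.0 * rel + 2.0               # synthetic metric ground truth
    sparse = np.zeros_like(metric)
    idx = rng.choice(metric.size, size=50, replace=False)
    sparse.flat[idx] = metric.flat[idx]     # keep only 50 range measurements

    pseudo, s, t = calibrate_relative_depth(rel, sparse)
    print(f"scale={s:.3f}, shift={t:.3f}")
```

A single global fit like this only fixes scale and offset. It cannot correct local errors in the relative depth, which is the role the abstract assigns to the refinement network that follows the prior where it is reliable and deviates where necessary. If a model outputs relative inverse depth rather than depth, the same fit would presumably be done in inverse-depth space.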
Taxonomy
Topics: Advanced Vision and Imaging · Speech and Audio Processing · Optical measurement and interference techniques
