OASIS-DC: Generalizable Depth Completion via Output-level Alignment of Sparse-Integrated Monocular Pseudo Depth

Jaehyeon Cho; Jhonghyun An

arXiv:2602.01268 · cs.CV · February 3, 2026

TL;DR

This paper introduces a method to convert relative depth estimates from monocular foundation models into metric depth using sparse range measurements, enabling accurate depth completion with minimal labeled data for robotics applications.

Contribution

The paper proposes a calibration approach that combines foundation-model priors with sparse range anchors, improving depth completion accuracy when few labeled samples are available.

Findings

1. Effective depth completion with few labeled samples
2. Stable scale and sharp edges in predictions
3. Robust performance without curated validation data

Abstract

Recent monocular foundation models excel at zero-shot depth estimation, yet their outputs are inherently relative rather than metric, limiting direct use in robotics and autonomous driving. We leverage the fact that relative depth preserves global layout and boundaries: by calibrating it with sparse range measurements, we transform it into a pseudo metric depth prior. Building on this prior, we design a refinement network that follows the prior where reliable and deviates where necessary, enabling accurate metric predictions from very few labeled samples. The resulting system is particularly effective when curated validation data are unavailable, sustaining stable scale and sharp edges across few-shot regimes. These findings suggest that coupling foundation priors with sparse anchors is a practical route to robust, deployment-ready depth completion under real-world label scarcity.
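
The calibration idea in the abstract, turning relative depth into a pseudo metric prior using sparse range measurements, can be illustrated with a minimal sketch. The abstract does not specify how the calibration is done, so the global least-squares scale-and-shift fit below is an assumption chosen for illustration, not the authors' method. The function name `align_relative_depth`, the convention that zero marks a missing sparse measurement, and the choice to fit in depth space (some foundation models output relative inverse depth instead) are likewise hypothetical.

```python
import numpy as np


def align_relative_depth(rel_depth, sparse_depth):
    """Fit scale and shift so that scale * rel_depth + shift matches sparse metric samples.

    Illustrative assumption: a single global affine fit, with zeros in
    sparse_depth marking pixels that have no range measurement.
    """
    valid = sparse_depth > 0
    if valid.sum() < 2:
        raise ValueError("need at least two sparse measurements to fit scale and shift")

    x = rel_depth[valid].astype(np.float64)
    y = sparse_depth[valid].astype(np.float64)

    # Solve min over (scale, shift) of || scale * x + shift - y ||^2.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, y, rcond=None)

    # Apply the fitted affine map to every pixel to get a dense pseudo metric prior.
    pseudo_metric = scale * rel_depth + shift
    return pseudo_metric, scale, shift


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rel = rng.uniform(0.1, 1.0, size=(8, 10))   # synthetic relative depth map
    metric = 2.5 * rel + 0.7                    # synthetic ground-truth metric depth
    sparse = np.zeros_like(metric)
    idx = rng.choice(metric.size, size=12, replace=False)
    sparse.flat[idx] = metric.flat[idx]         # keep only 12 range samples

    pseudo, scale, shift = align_relative_depth(rel, sparse)
    print(scale, shift, np.abs(pseudo - metric).max())
```

A single global fit like this cannot correct spatially varying errors in the relative depth, which is consistent with the abstract's point that a refinement network is needed to deviate from the prior where it is unreliable.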

Taxonomy

Topics: Advanced Vision and Imaging · Speech and Audio Processing · Optical measurement and interference techniques