ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation

Ruijie Zhu; Chuxin Wang; Ziyang Song; Li Liu; Tianzhu Zhang; Yongdong Zhang

arXiv:2407.08187 · cs.CV · February 25, 2026

TL;DR

ScaleDepth introduces a novel approach to metric depth estimation by decomposing it into scene scale prediction and relative depth estimation, enabling better generalization across diverse scenes without fine-tuning.

Contribution

The paper proposes a new monocular depth estimation method that decomposes metric depth into scene scale and relative depth, with modules for semantic-aware scale prediction and adaptive relative depth estimation.
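
The decomposition described above can be illustrated with a minimal sketch. It assumes the two quantities combine multiplicatively (metric depth = scene scale × relative depth), which the summary does not state explicitly. The function name `compose_metric_depth`, the normalized relative depth map, and the scale values of 10 and 80 metres are hypothetical and chosen only for illustration; none of this is the authors' SASP or ARDE implementation.

```python
import numpy as np


def compose_metric_depth(relative_depth: np.ndarray, scene_scale: float) -> np.ndarray:
    """Combine a relative depth map with a scalar scene scale.

    Assumption: metric depth = scene_scale * relative_depth. This is one
    plausible reading of the decomposition, not the paper's confirmed formula.
    """
    if scene_scale <= 0:
        raise ValueError("scene_scale must be positive")
    return scene_scale * relative_depth


# Hypothetical relative depth map, normalized to [0, 1].
relative = np.array([[0.1, 0.5],
                     [0.75, 1.0]])

# The same relative structure mapped to two illustrative scene scales (metres).
indoor = compose_metric_depth(relative, scene_scale=10.0)
outdoor = compose_metric_depth(relative, scene_scale=80.0)

print(indoor)
print(outdoor)
```

Under this assumption, the relative depth map only encodes scene structure, while a single per-image scalar carries the difference between, for example, a room and a street. That separation is what would let one model handle scenes with very different depth ranges.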

Findings

01. Achieves state-of-the-art results across indoor and outdoor scenes.

02. Effectively generalizes to unseen scenes without fine-tuning.

03. Handles diverse scene scales in a unified framework.

Abstract

Estimating depth from a single image is a challenging visual task. Compared to relative depth estimation, metric depth estimation attracts more attention due to its practical physical significance and critical applications in real-life scenarios. However, existing metric depth estimation methods are typically trained on specific datasets with similar scenes, facing challenges in generalizing across scenes with significant scale variations. To address this challenge, we propose a novel monocular depth estimation method called ScaleDepth. Our method decomposes metric depth into scene scale and relative depth, and predicts them through a semantic-aware scale prediction (SASP) module and an adaptive relative depth estimation (ARDE) module, respectively. The proposed ScaleDepth enjoys several merits. First, the SASP module can implicitly combine structural and semantic features of the images…

Code & Models

Repositories

RuijieZhu94/mmdepth/blob/main/projects/ScaleDepth/README.md (PyTorch, official)

Taxonomy

Methods: Softmax · Attention Is All You Need