Monocular Depth Estimation From the Perspective of Feature Restoration: A Diffusion Enhanced Depth Restoration Approach

Huibin Bai; Shuai Li; Hanxiao Zhai; Yanbo Gao; Chong Lv; Yibo Wang; Haipeng Ping; Wei Hua; Xingyu Gao

arXiv:2604.07664·cs.CV·April 10, 2026

TL;DR

This paper introduces a novel feature restoration approach using diffusion models for monocular depth estimation, improving accuracy by leveraging invertible transforms and auxiliary features.

Contribution

It proposes a diffusion-based feature restoration framework with invertible transforms and auxiliary viewpoint enhancement for improved monocular depth estimation.

Findings

01. Achieves better performance than state-of-the-art methods on multiple datasets.

02. Improves KITTI benchmark RMSE by 4.09% and 37.77%.

03. Demonstrates the effectiveness of feature restoration via diffusion models.
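
The RMSE figures in the findings read most naturally as relative reductions against a baseline, although the listing does not say so explicitly. As a minimal sketch under that assumption, the snippet below computes RMSE over valid pixels and the percentage improvement between two predictions. The depth values, the `gt > 0` validity mask, and the function names are illustrative assumptions, not taken from the paper or its repository.

```python
import numpy as np

def rmse(pred, gt):
    """Root-mean-square error over pixels with valid ground truth."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Assumption: a depth of 0 marks a missing measurement (common for sparse LiDAR maps).
    valid = gt > 0
    return float(np.sqrt(np.mean((pred[valid] - gt[valid]) ** 2)))

def relative_improvement(baseline, new):
    """Percentage reduction of an error metric relative to the baseline."""
    return 100.0 * (baseline - new) / baseline

# Hypothetical depth maps in metres; these are NOT results from the paper.
gt = np.array([[10.0, 0.0], [20.0, 40.0]])
baseline_pred = np.array([[12.0, 5.0], [18.0, 44.0]])
new_pred = np.array([[11.0, 5.0], [19.0, 42.0]])

baseline_rmse = rmse(baseline_pred, gt)
new_rmse = rmse(new_pred, gt)
improvement = relative_improvement(baseline_rmse, new_rmse)
print(f"baseline RMSE {baseline_rmse:.3f}, new RMSE {new_rmse:.3f}, improvement {improvement:.2f}%")
```

Under this reading, a 4.09% improvement would mean the new RMSE is about 95.91% of the baseline RMSE.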

Abstract

Monocular Depth Estimation (MDE) is a fundamental computer vision task with important applications in 3D vision. Current mainstream MDE methods employ an encoder-decoder architecture with multi-level/scale feature processing. However, the limitations of this architecture and the effects of features at different levels on prediction accuracy have not been evaluated. In this paper, we first investigate this problem and show that there is still substantial potential in the current framework if encoder features can be improved. Therefore, we propose to formulate depth estimation from the feature restoration perspective, treating pretrained encoder features as degraded versions of an assumed ground truth feature that yields the ground truth depth map. Then an Invertible Transform-enhanced Indirect Diffusion (InvT-IndDiffusion) module is developed for feature…
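
The abstract names an invertible transform, but the visible text does not describe how it is built. To show what invertibility means for a feature map, here is a minimal sketch using an additive coupling layer, one common invertible construction that is swapped in purely for illustration. It is an assumption, not the authors' InvT-IndDiffusion module, and every name in it (`coupling_forward`, `coupling_inverse`, `shift_fn`, `W`) is hypothetical.

```python
import numpy as np

def coupling_forward(x, shift_fn):
    """Additive coupling: keep the first half of the channels, shift the second half."""
    x1, x2 = np.split(x, 2, axis=0)
    return np.concatenate([x1, x2 + shift_fn(x1)], axis=0)

def coupling_inverse(y, shift_fn):
    """Exact inverse: the first half is unchanged, so the same shift can be subtracted."""
    y1, y2 = np.split(y, 2, axis=0)
    return np.concatenate([y1, y2 - shift_fn(y1)], axis=0)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 2))  # hypothetical channel-mixing weights

def shift_fn(h):
    # Any function of the untouched half works; it does not need to be invertible itself.
    return np.tanh(np.einsum("ij,jhw->ihw", W, h))

# Hypothetical encoder feature map with shape (channels, height, width).
features = rng.normal(size=(4, 3, 3))
latent = coupling_forward(features, shift_fn)
restored = coupling_inverse(latent, shift_fn)
print("max reconstruction error:", np.max(np.abs(restored - features)))
```

The point of the sketch is that mapping features into another space and back loses no information, so any processing done in the transformed space can be carried back to the original feature space.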

Code & Models

Repositories

whitehb1/IID-RDepth (GitHub)