TL;DR
DepthMaster introduces a single-step diffusion model with feature alignment and Fourier enhancement modules for improved monocular depth estimation, balancing global structure and fine details.
Contribution
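DepthMaster is a single-step diffusion approach with two added modules: a Feature Alignment module that brings high-quality semantic features into the denoising network, and a Fourier-based module that restores fine-grained detail. Together they target both the accuracy and the inference efficiency of monocular depth estimation.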
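As a rough illustration of what aligning two feature sets can mean, the sketch below scores how far a denoising network's features are from those of a semantic encoder. Everything here is an assumption made for illustration: the function name `alignment_loss`, the `(N, C)` feature layout, and the choice of cosine distance are hypothetical, and the summary does not state which alignment objective the paper uses.

```python
import numpy as np

def alignment_loss(denoiser_feats, semantic_feats, eps=1e-8):
    """Mean cosine distance between two feature arrays of shape (N, C).

    Hypothetical stand-in for a feature alignment objective: 0 means the
    feature directions match at every location, 2 means they are opposite.
    """
    # L2-normalise each feature vector so only direction is compared.
    a = denoiser_feats / (np.linalg.norm(denoiser_feats, axis=1, keepdims=True) + eps)
    b = semantic_feats / (np.linalg.norm(semantic_feats, axis=1, keepdims=True) + eps)
    cosine = np.sum(a * b, axis=1)
    return float(np.mean(1.0 - cosine))

# Usage with random placeholder features (not real network outputs).
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 32))
loss_same = alignment_loss(feats, feats)       # close to 0
loss_opposite = alignment_loss(feats, -feats)  # close to 2
```

In a training setup, a term like this would typically be added to the main depth loss so the denoising features are pulled toward the semantic ones; how DepthMaster weights or schedules such a term is not described in this summary.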
Findings
Achieves state-of-the-art generalization in depth estimation.
Effectively balances low-frequency structure and high-frequency details.
Outperforms existing diffusion-based methods across datasets.
Abstract
Monocular depth estimation within the diffusion-denoising paradigm demonstrates impressive generalization ability but suffers from low inference speed. Recent methods adopt a single-step deterministic paradigm to improve inference efficiency while maintaining comparable performance. However, they overlook the gap between generative and discriminative features, leading to suboptimal results. In this work, we propose DepthMaster, a single-step diffusion model designed to adapt generative features for the discriminative depth estimation task. First, to mitigate overfitting to texture details introduced by generative features, we propose a Feature Alignment module, which incorporates high-quality semantic features to enhance the denoising network's representation capability. Second, to address the lack of fine-grained details in the single-step deterministic framework, we propose a Fourier…
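To make the low-frequency versus high-frequency distinction concrete, here is a minimal sketch that splits a 2D map into a smooth component and a detail component using an ideal low-pass mask in the Fourier domain. It is not the paper's Fourier module; the function name `split_frequencies` and the `cutoff` parameter are assumptions chosen only for illustration.

```python
import numpy as np

def split_frequencies(depth, cutoff=0.1):
    """Split a 2D array into low- and high-frequency parts.

    `cutoff` is a radius in cycles per pixel (hypothetical parameter);
    frequencies at or below it go to `low`, the remainder to `high`.
    """
    h, w = depth.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies
    mask = np.sqrt(fx**2 + fy**2) <= cutoff
    spectrum = np.fft.fft2(depth)
    # The mask is symmetric, so the inverse transform is real up to round-off.
    low = np.real(np.fft.ifft2(spectrum * mask))
    high = depth - low
    return low, high

# Usage: a flat surface plus a pixel-level checkerboard texture.
ii, jj = np.indices((8, 8))
checker = np.where((ii + jj) % 2 == 0, 1.0, -1.0)
depth = 5.0 + 0.1 * checker
low, high = split_frequencies(depth)
# `low` holds the flat 5.0 surface; `high` holds the 0.1-amplitude texture.
```

The two parts always sum back to the input, which is why a method can treat global structure and fine detail separately and then recombine them.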
