PatchRefiner V2: Fast and Lightweight Real-Domain High-Resolution Metric Depth Estimation
Zhenyu Li, Wenqing Cui, Shariq Farooq Bhat, Peter Wonka

TL;DR
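PatchRefiner V2 is a fast, lightweight method for high-resolution metric depth estimation. It replaces the heavy refiner model with a lightweight encoder, then uses a Coarse-to-Fine module, a Noisy Pretraining strategy, and a Scale-and-Shift Invariant Gradient Matching loss to recover accuracy and improve synthetic-to-real domain transfer.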
Contribution
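The paper introduces a depth estimation framework built on lightweight refiner encoders. It adds a Coarse-to-Fine (C2F) module with a Guided Denoising Unit, a Noisy Pretraining strategy for the refiner branch, and a Scale-and-Shift Invariant Gradient Matching (SSIGM) loss, improving speed and accuracy over existing methods.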
Findings
Outperforms state-of-the-art methods on UnrealStereo4K in accuracy and speed.
Reduces model size and inference time significantly.
Improves depth boundary delineation on real-world datasets.
Abstract
While current high-resolution depth estimation methods achieve strong results, they often suffer from computational inefficiencies due to reliance on heavyweight models and multiple inference steps, increasing inference time. To address this, we introduce PatchRefiner V2 (PRV2), which replaces heavy refiner models with lightweight encoders. This reduces model size and inference time but introduces noisy features. To overcome this, we propose a Coarse-to-Fine (C2F) module with a Guided Denoising Unit for refining and denoising the refiner features and a Noisy Pretraining strategy to pretrain the refiner branch to fully exploit the potential of the lightweight refiner branch. Additionally, we introduce a Scale-and-Shift Invariant Gradient Matching (SSIGM) loss to enhance synthetic-to-real domain transfer. PRV2 outperforms state-of-the-art depth estimation methods on UnrealStereo4K in both…
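The abstract names the SSIGM loss but does not give its formula here. The following is only a minimal sketch of what a scale-and-shift invariant gradient matching loss could look like, assuming a least-squares alignment of the prediction to the target followed by an L1 penalty on finite-difference gradients. The function names (`align_scale_shift`, `gradients`, `ssi_gradient_loss`) are hypothetical and not taken from the paper, and plain Python lists are used only to keep the example dependency-free.

```python
def align_scale_shift(pred, target):
    """Least-squares scale s and shift t such that s * pred + t ~ target.

    Assumption: a closed-form global alignment; the paper may align differently.
    """
    p = [v for row in pred for v in row]
    g = [v for row in target for v in row]
    n = len(p)
    mean_p = sum(p) / n
    mean_g = sum(g) / n
    var_p = sum((x - mean_p) ** 2 for x in p)
    cov = sum((x - mean_p) * (y - mean_g) for x, y in zip(p, g))
    # A flat prediction has no usable scale; fall back to 1.0.
    s = cov / var_p if var_p > 0 else 1.0
    t = mean_g - s * mean_p
    return s, t


def gradients(img):
    """Forward finite differences along x (columns) and y (rows)."""
    gx = [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in img]
    gy = [[img[i + 1][j] - img[i][j] for j in range(len(img[0]))]
          for i in range(len(img) - 1)]
    return gx, gy


def ssi_gradient_loss(pred, target):
    """Mean absolute difference between gradients of the aligned prediction
    and gradients of the target (hypothetical stand-in for SSIGM)."""
    s, t = align_scale_shift(pred, target)
    # The shift t cancels in the finite differences below; it is applied
    # here only so that `aligned` is the full least-squares fit.
    aligned = [[s * v + t for v in row] for row in pred]
    pgx, pgy = gradients(aligned)
    tgx, tgy = gradients(target)
    diffs = [abs(a - b) for ra, rb in zip(pgx, tgx) for a, b in zip(ra, rb)]
    diffs += [abs(a - b) for ra, rb in zip(pgy, tgy) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# Usage: a prediction that differs from the target only by an affine map
# incurs (near) zero loss, which is the invariance the loss name refers to.
pred = [[1.0, 2.0, 4.0], [2.0, 3.0, 7.0], [5.0, 1.0, 0.0]]
target = [[2.0 * v + 3.0 for v in row] for row in pred]
loss_affine = ssi_gradient_loss(pred, target)
```

Because the loss compares gradients after alignment, it penalizes mismatched depth edges while ignoring global scale and offset, which is one plausible reason such a loss would help when supervision comes from a different domain than the metric ground truth.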
Taxonomy
Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Advanced Image and Video Retrieval Techniques
