Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues

Chinmay Talegaonkar; Nikhil Gandudi Suresh; Zachary Novack; Yash Belhe; Priyanka Nagasamudra; Nicholas Antipa

arXiv:2505.17358·cs.CV·May 26, 2025

TL;DR

This paper introduces a training-free method that turns Marigold, a pre-trained diffusion model for scale-invariant depth, into a zero-shot metric depth estimator by using defocus blur cues from two images of the same viewpoint, one captured with a small aperture and one with a large aperture.

Contribution

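The paper converts a pre-trained diffusion model into a metric depth predictor without additional training: at inference time it optimizes the metric depth scaling parameters and the model's noise latents using gradients from a loss based on the defocus-blur image formation model.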

Findings

01 Outperforms existing zero-shot MMDE methods on real datasets.

02 Effectively incorporates defocus cues at inference time.

03 Improves depth estimation accuracy and generalization.

Abstract

Recent monocular metric depth estimation (MMDE) methods have made notable progress towards zero-shot generalization. However, they still exhibit a significant performance drop on out-of-distribution datasets. We address this limitation by injecting defocus blur cues at inference time into Marigold, a pre-trained diffusion model for zero-shot, scale-invariant monocular depth estimation (MDE). Our method effectively turns Marigold into a metric depth predictor in a training-free manner. To incorporate defocus cues, we capture two images with a small and a large aperture from the same viewpoint. To recover metric depth, we then optimize the metric depth scaling parameters and the noise latents of Marigold at inference time using gradients from a loss function based on the defocus-blur image formation model. We compare our method against existing state-of-the-art zero-shot MMDE…
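To make the idea of recovering metric scale from defocus more concrete, here is a minimal toy sketch, not the authors' implementation. It assumes that the "metric depth scaling parameters" are an affine scale and shift applied to a relative depth value, uses the standard thin-lens circle-of-confusion formula as the blur model, and swaps the paper's gradient-based optimization (which also updates Marigold's noise latents) for a plain grid search over scale and shift. The function names, the grid ranges, and the camera settings are all hypothetical.

```python
def coc_diameter(depth_m, focus_m, focal_len_m, f_number):
    """Thin-lens circle-of-confusion diameter on the sensor, in metres."""
    aperture_m = focal_len_m / f_number  # aperture diameter
    return (aperture_m * focal_len_m * abs(depth_m - focus_m)
            / (depth_m * (focus_m - focal_len_m)))


def blur_loss(scale, shift, rel_depths, observed_coc,
              focus_m, focal_len_m, f_number):
    """Mean squared error between predicted and observed blur sizes."""
    total = 0.0
    for rel, target in zip(rel_depths, observed_coc):
        depth_m = scale * rel + shift  # assumed affine map to metric depth
        predicted = coc_diameter(depth_m, focus_m, focal_len_m, f_number)
        total += (predicted - target) ** 2
    return total / len(rel_depths)


def fit_scale_shift(rel_depths, observed_coc, focus_m, focal_len_m, f_number):
    """Grid search (a stand-in for gradient descent) over scale and shift."""
    best = (None, None, float("inf"))
    for i in range(1, 21):          # scale candidates: 0.5 .. 10.0
        scale = 0.5 * i
        for j in range(0, 9):       # shift candidates: 0.0 .. 2.0
            shift = 0.25 * j
            loss = blur_loss(scale, shift, rel_depths, observed_coc,
                             focus_m, focal_len_m, f_number)
            if loss < best[2]:
                best = (scale, shift, loss)
    return best


# Synthetic example: relative depths at five pixels, blur sizes generated
# from a known scale and shift with a 50 mm f/1.8 lens focused at 2 m.
rel_depths = [0.1, 0.3, 0.5, 0.7, 0.9]
true_scale, true_shift = 4.0, 0.5
observed = [coc_diameter(true_scale * r + true_shift, 2.0, 0.05, 1.8)
            for r in rel_depths]
scale, shift, loss = fit_scale_shift(rel_depths, observed, 2.0, 0.05, 1.8)
```

In this toy setting the blur size depends on absolute distance, so a relative depth map alone cannot explain the observed blur unless the scale and shift are right. The actual method compares rendered defocused images rather than blur diameters, so this sketch only illustrates why defocus carries metric information.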

Videos

Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues (SlidesLive)