ReDepth Anything: Test-Time Depth Refinement via Self-Supervised Re-lighting

Ananta R. Bhattarai; Helge Rhodin

arXiv:2512.17908·cs.CV·March 10, 2026

Open Access

TL;DR

ReDepth Anything introduces a test-time self-supervision framework that refines monocular depth estimates by re-lighting the predicted depth map and augmenting the input image, using the priors of large-scale 2D diffusion models to improve depth accuracy and realism without labels or full model fine-tuning.

Contribution

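The paper presents a test-time self-supervision method that fuses depth foundation models with 2D diffusion priors for depth refinement. It updates only intermediate embeddings and the decoder's weights rather than fine-tuning the full model, with the aim of improving depth estimation on real-world images far from the training distribution.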

Findings

01. Significant improvements in depth accuracy across benchmarks.

02. Enhanced realism in depth maps through re-lighting and augmentation.

03. Achieved state-of-the-art results when combined with Depth Anything 3.

Abstract

Monocular depth estimation remains challenging, as foundation models such as Depth Anything V2 (DA-V2) struggle with real-world images that are far from the training distribution. We introduce Re-Depth Anything, a test-time self-supervision framework that bridges this domain gap by fusing foundation models with the powerful priors of large-scale 2D diffusion models. Our method performs label-free refinement directly on the input image by re-lighting the predicted depth map and augmenting the input. This re-synthesis method replaces classical photometric reconstruction by leveraging shape from shading (SfS) cues in a new, generative context with Score Distillation Sampling (SDS). To prevent optimization collapse, our framework updates only intermediate embeddings and the decoder's weights, rather than optimizing the depth tensor directly or fine-tuning the full model. Across diverse…
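
The re-lighting step described in the abstract can be illustrated with a minimal sketch: surface normals are derived from a depth map, then shaded under a chosen light direction. This is not the authors' renderer. The orthographic camera, the Lambertian shading model, the ambient term, and the function names `normals_from_depth` and `relight` are all assumptions made for illustration. In the method described above, a re-lit image of this kind would then be scored by a diffusion model through SDS, which is not shown here.

```python
import numpy as np

def normals_from_depth(depth):
    """Estimate unit surface normals from a depth map with finite differences.

    Assumption: orthographic camera, so the surface z = depth(x, y)
    has the (unnormalized) normal (-dz/dx, -dz/dy, 1).
    """
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

def relight(depth, albedo, light_dir, ambient=0.1):
    """Render a Lambertian re-lit image from depth and per-pixel albedo.

    depth:     (H, W) array
    albedo:    (H, W, 3) array with values in [0, 1]
    light_dir: 3-vector pointing from the surface toward the light
    ambient:   hypothetical constant ambient term, not taken from the paper
    """
    light = np.asarray(light_dir, dtype=np.float64)
    light = light / np.linalg.norm(light)
    # Lambertian shading: clamp the cosine between normal and light at zero.
    shading = np.clip(normals_from_depth(depth) @ light, 0.0, None)
    return albedo * (ambient + (1.0 - ambient) * shading)[..., None]
```

For a flat depth map lit head-on, every normal points along the view axis, the shading is uniform, and the output equals the albedo. Errors in the predicted depth change the normals and therefore the shading, which is the shape-from-shading cue the abstract refers to.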

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis