Re-Depth Anything: Test-Time Depth Refinement via Self-Supervised Re-lighting
Ananta R. Bhattarai, Helge Rhodin

TL;DR
Re-Depth Anything is a test-time self-supervision framework that refines monocular depth estimates by re-lighting the predicted depth map and augmenting the input image. It uses the priors of a large-scale 2D diffusion model as the supervision signal, so it needs no labels and no fine-tuning of the full model.
Contribution
It presents a test-time self-supervision method that fuses a depth foundation model with 2D diffusion priors for depth refinement. Only the intermediate embeddings and the decoder's weights are updated, which avoids both direct optimization of the depth tensor and full-model fine-tuning.
Findings
Significant improvements in depth accuracy across benchmarks.
Enhanced realism in depth maps through re-lighting and augmentation.
Achieved state-of-the-art results when combined with Depth Anything 3.
Abstract
Monocular depth estimation remains challenging, as foundation models such as Depth Anything V2 (DA-V2) struggle with real-world images that are far from the training distribution. We introduce Re-Depth Anything, a test-time self-supervision framework that bridges this domain gap by fusing foundation models with the powerful priors of large-scale 2D diffusion models. Our method performs label-free refinement directly on the input image by re-lighting the predicted depth map and augmenting the input. This re-synthesis method replaces classical photometric reconstruction by leveraging shape from shading (SfS) cues in a new, generative context with Score Distillation Sampling (SDS). To prevent optimization collapse, our framework updates only intermediate embeddings and the decoder's weights, rather than optimizing the depth tensor directly or fine-tuning the full model. Across diverse…
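The re-lighting step mentioned in the abstract relies on shape-from-shading cues: a depth map implies surface normals, and normals plus a light direction imply a shaded image. The following is a minimal sketch of that forward model only, not the authors' renderer. It assumes an orthographic camera, a Lambertian surface with constant albedo, and a single directional light; the function names `normals_from_depth` and `relight` are hypothetical. The diffusion prior and the SDS loss that would score the re-lit image are not shown.

```python
import math


def normals_from_depth(depth):
    """Estimate unit surface normals from a depth map given as a list of rows.

    Assumption: orthographic camera, so the normal is proportional to
    (-dz/dx, -dz/dy, 1), with z pointing toward the viewer.
    """
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            # Central differences in the interior, one-sided at the borders.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / max(x1 - x0, 1)
            dzdy = (depth[y1][x] - depth[y0][x]) / max(y1 - y0, 1)
            n = (-dzdx, -dzdy, 1.0)
            length = math.sqrt(sum(c * c for c in n))
            row.append(tuple(c / length for c in n))
        normals.append(row)
    return normals


def relight(depth, light_dir, albedo=1.0):
    """Render Lambertian shading, albedo * max(0, n . l), for every pixel."""
    length = math.sqrt(sum(c * c for c in light_dir))
    light = tuple(c / length for c in light_dir)
    return [
        [albedo * max(0.0, sum(a * b for a, b in zip(n, light))) for n in row]
        for row in normals_from_depth(depth)
    ]


if __name__ == "__main__":
    # A plane tilted along x, lit once head-on and once along its normal.
    ramp = [[float(x) for x in range(4)] for _ in range(3)]
    print(relight(ramp, (0.0, 0.0, 1.0))[0])
    print(relight(ramp, (-1.0, 0.0, 1.0))[0])
```

Because the shading depends on the depth gradients, an error in the predicted geometry shows up as implausible shading in the re-lit image. In the framework described above, that is the signal a generative prior could penalize.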
Taxonomy
Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis
