Score: 1

Re-Depth Anything: Test-Time Depth Refinement via Self-Supervised Re-lighting

Published: December 19, 2025 | arXiv ID: 2512.17908v1

By: Ananta R. Bhattarai, Helge Rhodin

Potential Business Impact:

Improves AI's ability to estimate distances (depth) from single photos.

Business Areas:
Image Recognition, Data and Analytics, Software

Monocular depth estimation remains challenging, as recent foundation models such as Depth Anything V2 (DA-V2) struggle with real-world images that are far from the training distribution. We introduce Re-Depth Anything, a test-time self-supervision framework that bridges this domain gap by fusing DA-V2 with the powerful priors of large-scale 2D diffusion models. Our method performs label-free refinement directly on the input image by re-lighting predicted depth maps and augmenting the input. This re-synthesis replaces classical photometric reconstruction by leveraging shape-from-shading (SfS) cues in a new, generative context with Score Distillation Sampling (SDS). To prevent optimization collapse, our framework employs a targeted optimization strategy: rather than optimizing depth directly or fine-tuning the full model, we freeze the encoder and update only intermediate embeddings while also fine-tuning the decoder. Across diverse benchmarks, Re-Depth Anything yields substantial gains in depth accuracy and realism over DA-V2, showcasing new avenues for self-supervision through augmented geometric reasoning.
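The abstract describes the refinement loop only conceptually. The sketch below is a minimal illustration of the targeted optimization it mentions: the encoder is frozen, and only an intermediate embedding plus the decoder weights are updated against a loss computed on a re-lit rendering of the predicted depth. Everything here is an assumption for illustration, not the authors' implementation: `ToyEncoder`/`ToyDecoder` stand in for DA-V2, the re-lighting is simple Lambertian shading of depth-derived normals, and `guidance_loss` is a plain luminance-consistency term used in place of the diffusion-based SDS gradient, which this sketch does not implement.

```python
# Minimal sketch of test-time refinement with a frozen encoder.
# Assumptions: toy modules replace DA-V2, and a luminance-consistency
# loss replaces the true SDS term from a pretrained diffusion model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):          # stand-in for the DA-V2 encoder
    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Conv2d(3, ch, 3, padding=1)
    def forward(self, x):
        return self.net(x)

class ToyDecoder(nn.Module):          # stand-in for the DA-V2 depth head
    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Conv2d(ch, 1, 3, padding=1)
    def forward(self, z):
        return F.softplus(self.net(z))  # positive depth values

def normals_from_depth(depth):
    """Finite-difference surface normals from a depth map (B, 1, H, W)."""
    dzdx = F.pad(depth[:, :, :, 1:] - depth[:, :, :, :-1], (0, 1, 0, 0))
    dzdy = F.pad(depth[:, :, 1:, :] - depth[:, :, :-1, :], (0, 0, 0, 1))
    n = torch.cat([-dzdx, -dzdy, torch.ones_like(depth)], dim=1)
    return F.normalize(n, dim=1)

def relight(depth, light_dir):
    """Lambertian shading of the predicted depth under a sampled light."""
    n = normals_from_depth(depth)
    l = F.normalize(light_dir, dim=0).view(1, 3, 1, 1)
    return (n * l).sum(dim=1, keepdim=True).clamp(min=0.0)

def guidance_loss(shaded, image):
    """Placeholder for the SDS term: in the paper this gradient comes from a
    pretrained 2D diffusion model scoring the re-lit rendering; here a simple
    luminance-consistency loss keeps the sketch runnable end to end."""
    luminance = image.mean(dim=1, keepdim=True)
    return F.mse_loss(shaded, luminance)

def refine(image, encoder, decoder, steps=50, lr=1e-3):
    encoder.eval()
    for p in encoder.parameters():        # encoder stays frozen
        p.requires_grad_(False)
    with torch.no_grad():
        z = encoder(image)
    z = z.clone().requires_grad_(True)    # refine intermediate embeddings
    opt = torch.optim.Adam([z, *decoder.parameters()], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        depth = decoder(z)
        light = torch.randn(3)            # randomly sampled light direction
        loss = guidance_loss(relight(depth, light), image)
        loss.backward()
        opt.step()
    return decoder(z).detach()

if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64)
    print(refine(img, ToyEncoder(), ToyDecoder()).shape)  # (1, 1, 64, 64)
```

Swapping `guidance_loss` for a real SDS gradient from a pretrained diffusion model, and the toy modules for DA-V2, would bring the sketch closer to the method as described, but those choices are outside what the abstract specifies.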

Page Count
19 pages

Category
Computer Science: Computer Vision and Pattern Recognition