Bridging the Ex-Vivo to In-Vivo Gap: Synthetic Priors for Monocular Depth Estimation in Specular Surgical Environments

Ankan Aich; Emma D. Ryan; Kris Moe; Isaac Schmale; Li-Xing Man; and Yangming Lee

arXiv:2512.23786·cs.CV·April 21, 2026

TL;DR

This paper adapts the synthetic depth priors of Depth Anything V2 to surgical scenes using Dynamic Vector Low-Rank Adaptation (DV-LORA). It targets the ex-vivo to in-vivo gap in monocular depth estimation caused by specular reflections and fluid-filled deformations, and reports state-of-the-art results on the SCARED dataset.

Contribution

It leverages synthetic depth priors with domain adaptation to improve in-vivo surgical depth estimation and introduces a new real-surgery validation dataset.

Findings

01. Achieves state-of-the-art on the SCARED dataset.

02. Reduces Squared Relative Error by over 17% in high-specularity regimes.

03. Demonstrates superior robustness on the ROCAL-T 90 dataset.
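
For readers unfamiliar with the metric in the second finding, the sketch below computes Squared Relative Error as it is commonly defined in the depth-estimation literature: the mean of (predicted − ground truth)² / ground truth over valid pixels. The function names, the flat-list input format, and the `min_depth`/`max_depth` validity thresholds are illustrative assumptions, not the paper's evaluation code.

```python
def squared_relative_error(pred, gt, min_depth=1e-3, max_depth=150.0):
    """Mean of (pred - gt)^2 / gt over pixels with valid ground truth.

    pred, gt: flat sequences of depth values in the same unit.
    min_depth, max_depth: hypothetical validity bounds (assumed values).
    """
    total = 0.0
    count = 0
    for p, g in zip(pred, gt):
        # Skip pixels whose ground truth falls outside the valid range.
        if min_depth < g < max_depth:
            total += (p - g) ** 2 / g
            count += 1
    if count == 0:
        raise ValueError("no valid ground-truth depths")
    return total / count


def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - improved) / baseline


# Toy values, not results from the paper.
print(squared_relative_error([2.0, 4.0], [1.0, 2.0]))  # 1.5
```

Under this definition, a reduction "by over 17%" means `relative_reduction(baseline_sq_rel, new_sq_rel) > 0.17`. Because the error is squared before normalising, the metric penalises large outliers heavily.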

Abstract

Accurate Monocular Depth Estimation (MDE) is critical for autonomous robotic surgery. However, existing self-supervised methods often exhibit a severe "ex-vivo to in-vivo gap": they achieve high accuracy on public datasets but struggle in actual clinical deployments. This disparity arises because of the severe specular reflections and fluid-filled deformations inherent to real surgeries. Models trained on noisy real-world pseudo-labels consequently suffer from severe boundary collapse. To address this, we leverage the high-fidelity synthetic priors of the Depth Anything V2 architecture, which inherently capture precise geometric details, and efficiently adapt them to the medical domain using Dynamic Vector Low-Rank Adaptation (DV-LORA). Our contributions are two-fold. Technically, our approach establishes a new state-of-the-art on the public SCARED dataset; under a novel…
