Unifying Scale-Aware Depth Prediction and Perceptual Priors for Monocular Endoscope Pose Estimation and Tissue Reconstruction

Muzammil Khan; Enzo Kerkhof; Matteo Fusaglia; Koert Kuhlmann; Theo Ruers; Françoise J. Siepel

arXiv:2508.11282·cs.CV·August 18, 2025

TL;DR

This paper introduces a unified framework combining scale-aware depth prediction and perceptual priors to improve monocular endoscope pose estimation and tissue reconstruction, addressing challenges like depth ambiguity and tissue deformation.

Contribution

The paper presents an integrated approach built from modules such as MAPIS-Depth and WEMA-RTDL, aimed at improving depth accuracy and registration in endoscopic tissue reconstruction.

Findings

01. Outperforms state-of-the-art methods on the HEVD and SCARED datasets.

02. Demonstrates robustness against tissue deformation and motion artifacts.

03. Provides high-quality 3D tissue surface reconstructions.

Abstract

Accurate endoscope pose estimation and 3D tissue surface reconstruction significantly enhance monocular minimally invasive surgical procedures by enabling precise navigation and improved spatial awareness. However, monocular endoscope pose estimation and tissue reconstruction face persistent challenges, including depth ambiguity, physiological tissue deformation, inconsistent endoscope motion, limited texture fidelity, and a restricted field of view. To overcome these limitations, a unified framework for monocular endoscopic tissue reconstruction is presented that integrates scale-aware depth prediction with temporally constrained perceptual refinement. This framework incorporates a novel MAPIS-Depth module, which leverages Depth Pro for robust initialisation and Depth Anything for efficient per-frame depth prediction, in conjunction with L-BFGS-B optimisation, to generate…
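
The abstract is cut off before it says what the L-BFGS-B step optimises, so the following is only a minimal sketch of one common way such a step can be used: fitting a scale and shift that map a relative depth map onto a metric reference. It is an assumption for illustration, not the MAPIS-Depth formulation. The function name `fit_scale_shift`, the least-squares objective, and the synthetic arrays standing in for Depth Anything and Depth Pro outputs are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def fit_scale_shift(relative_depth, metric_depth, mask=None):
    """Hypothetical helper: find s, t so that s * relative_depth + t
    approximates metric_depth in the least-squares sense."""
    rel = np.asarray(relative_depth, dtype=np.float64).ravel()
    ref = np.asarray(metric_depth, dtype=np.float64).ravel()
    if mask is not None:
        # Optionally restrict the fit to valid pixels.
        keep = np.asarray(mask, dtype=bool).ravel()
        rel, ref = rel[keep], ref[keep]

    def loss_and_grad(params):
        s, t = params
        residual = s * rel + t - ref
        loss = 0.5 * np.mean(residual ** 2)
        # Analytic gradient with respect to (s, t).
        grad = np.array([np.mean(residual * rel), np.mean(residual)])
        return loss, grad

    result = minimize(
        loss_and_grad,
        x0=np.array([1.0, 0.0]),
        jac=True,
        method="L-BFGS-B",
        # Assumed constraint: keep the scale positive so depth ordering is preserved.
        bounds=[(1e-6, None), (None, None)],
        options={"ftol": 1e-12, "gtol": 1e-10},
    )
    return float(result.x[0]), float(result.x[1])


# Synthetic stand-ins: a relative depth map and a metric map related by
# an affine transform (values chosen arbitrarily for the demonstration).
rng = np.random.default_rng(0)
relative = rng.uniform(0.1, 1.0, size=(32, 32))
metric = 50.0 * relative + 5.0

scale, shift = fit_scale_shift(relative, metric)
aligned = scale * relative + shift
```

In this sketch the bound on the scale is the only reason to prefer L-BFGS-B over a closed-form least-squares fit; a real pipeline would presumably optimise a richer objective than the two-parameter one assumed here.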
