EndoVGGT: GNN-Enhanced Depth Estimation for Surgical 3D Reconstruction
Falong Fan, Yi Xie, Arnis Lektauers, Bo Liu, Jerzy Rozenblit

TL;DR
EndoVGGT introduces a GNN-based framework with a dynamic graph attention module to enhance 3D reconstruction of deformable tissues in surgery, improving accuracy and generalization over prior methods.
Contribution
The paper presents EndoVGGT, a novel geometry-centric approach whose Deformation-aware Graph Attention (DeGAT) module captures long-range tissue correlations for robust, domain-agnostic surgical 3D reconstruction.
Findings
Increased PSNR by 24.6% and SSIM by 9.1% over the prior state of the art on SCARED.
Achieved strong zero-shot generalization to unseen datasets.
Enforced global consistency in non-rigid tissue deformation recovery.
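For readers unfamiliar with the PSNR metric cited in the findings, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function names `psnr` and `relative_gain` are illustrative and not taken from the paper, and it is an assumption here that the reported percentage is a relative change in the dB value; the summary does not say how it was computed.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


def relative_gain(new_score, baseline_score):
    """Percentage change of new_score relative to baseline_score.

    Assumption: this mirrors how a figure like "+24.6% PSNR" might be
    derived; the source does not confirm the exact computation.
    """
    return 100.0 * (new_score - baseline_score) / baseline_score
```

Because PSNR is logarithmic, a relative gain in dB corresponds to a much larger reduction in mean squared error than the percentage alone suggests.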
Abstract
Accurate 3D reconstruction of deformable soft tissues is essential for surgical robotic perception. However, low-texture surfaces, specular highlights, and instrument occlusions often fragment geometric continuity, posing a challenge for existing fixed-topology approaches. To address this, we propose EndoVGGT, a geometry-centric framework equipped with a Deformation-aware Graph Attention (DeGAT) module. Rather than using static spatial neighborhoods, DeGAT dynamically constructs feature-space semantic graphs to capture long-range correlations among coherent tissue regions. This enables robust propagation of structural cues across occlusions, enforcing global consistency and improving non-rigid deformation recovery. Extensive experiments on SCARED show that our method significantly improves fidelity, increasing PSNR by 24.6% and SSIM by 9.1% over prior state-of-the-art. Crucially,…
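The idea of building a graph in feature space rather than over fixed spatial neighborhoods can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' DeGAT implementation: the cosine-similarity k-nearest-neighbor graph, the dot-product attention, the residual update, and the names `knn_feature_graph` and `graph_attention` are all hypothetical choices made for this example.

```python
import numpy as np


def knn_feature_graph(feats, k):
    """For each token, return indices of its k most similar tokens.

    Similarity is cosine similarity in feature space (an assumption),
    so neighbors may be spatially far apart in the image.
    """
    normed = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-edges
    return np.argsort(-sim, axis=1)[:, :k]


def graph_attention(feats, neighbors):
    """Aggregate neighbor features with softmax attention weights."""
    n, d = feats.shape
    nb = feats[neighbors]                                  # (n, k, d)
    scores = np.einsum("nd,nkd->nk", feats, nb) / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    aggregated = np.einsum("nk,nkd->nd", weights, nb)
    return feats + aggregated, weights                     # residual update


# Hypothetical usage: 16 patch tokens with 8-dimensional features.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))
edges = knn_feature_graph(tokens, k=4)
updated, attn = graph_attention(tokens, edges)
```

Because the graph is rebuilt from the current features, two tissue patches separated by an instrument can still exchange information if their features are similar, which is the intuition the abstract describes for propagating structural cues across occlusions.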
