Subsampled Randomized Fourier GaLore for Adapting Foundation Models in Depth-Driven Liver Landmark Segmentation
Yun-Chen Lin, Jiayuan Huang, Hanyuan Zhang, Sergi Kavtaradze, Matthew J. Clarkson, Mobarak I. Hoque

TL;DR
This paper introduces a novel depth-guided liver landmark segmentation framework using foundation models, employing a low-rank gradient projection method called SRFT-GaLore for efficient adaptation and demonstrating improved accuracy and robustness in surgical imaging.
Contribution
We propose SRFT-GaLore, a low-rank gradient projection technique that enables efficient fine-tuning of large vision models for medical image segmentation, along with a dual-encoder framework integrating RGB and depth cues.
Findings
Achieved a 4.85% higher Dice score on the L3D dataset.
Reduced surface distance by 11.78 points compared to baselines.
Maintained strong cross-dataset generalization on LLSD.
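For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of the standard binary Dice coefficient in NumPy. The function name `dice_score` and the `eps` smoothing term are illustrative assumptions; the paper's exact evaluation protocol (class averaging, thresholds) is not given on this page.

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    """Binary Dice coefficient: 2|A ∩ B| / (|A| + |B|).

    `eps` is an assumed smoothing term so that two empty masks score 1.0
    instead of dividing by zero.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```

A perfect overlap gives 1.0 and disjoint non-empty masks give a value near 0, so a higher Dice score indicates better agreement between predicted and ground-truth landmark masks.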
Abstract
Accurate detection and delineation of anatomical structures in medical imaging are critical for computer-assisted interventions, particularly in laparoscopic liver surgery, where 2D video streams limit depth perception and complicate landmark localization. While recent works have leveraged monocular depth cues for enhanced landmark detection, challenges remain in fusing RGB and depth features and in efficiently adapting large-scale vision models to surgical domains. We propose a depth-guided liver landmark segmentation framework that integrates semantic and geometric cues via vision foundation encoders. We employ the Segment Anything Model V2 (SAM2) encoder to extract RGB features and the Depth Anything V2 (DA2) encoder to extract depth-aware features. To efficiently adapt SAM2, we introduce SRFT-GaLore, a novel low-rank gradient projection method that replaces the computationally expensive SVD with…
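The abstract is truncated here, so the exact SRFT-GaLore construction is not shown on this page. As a rough illustration of the general idea named in the title, the sketch below projects a gradient matrix into a low-rank space with a subsampled randomized Fourier-type transform instead of an SVD. Everything in it is an assumption made for illustration: the function names, the use of a real Hartley variant of the Fourier transform (chosen so values stay real), and the orthonormal scaling. It is not the authors' implementation.

```python
import numpy as np

def make_srft(m, rank, rng):
    """Draw the random parts of an SRFT-style projector (hypothetical helper).

    signs: random +/-1 diagonal D; rows: `rank` subsampled row indices S.
    """
    signs = rng.choice([-1.0, 1.0], size=m)
    rows = rng.choice(m, size=rank, replace=False)
    return signs, rows

def _hartley(x):
    # Orthonormal discrete Hartley transform along axis 0, computed via FFT.
    # It is real, symmetric, and its own inverse.
    spectrum = np.fft.fft(x, axis=0)
    return (spectrum.real - spectrum.imag) / np.sqrt(x.shape[0])

def project(grad, signs, rows):
    """Low-rank projection R = S H D G, shape (rank, n). No SVD is needed."""
    return _hartley(signs[:, None] * grad)[rows]

def project_back(low_rank, signs, rows, m):
    """Map a low-rank update back to full size: D H S^T R, shape (m, n)."""
    padded = np.zeros((m, low_rank.shape[1]))
    padded[rows] = low_rank
    return signs[:, None] * _hartley(padded)

rng = np.random.default_rng(0)
grad = rng.standard_normal((64, 32))      # stand-in for a weight gradient
signs, rows = make_srft(64, 8, rng)
low_rank = project(grad, signs, rows)     # optimizer state would live here
update = project_back(low_rank, signs, rows, 64)
```

In a GaLore-style loop the optimizer statistics would be kept on `low_rank` (here 8 x 32 rather than 64 x 32) and the resulting step mapped back with `project_back`. Because the rows of the projector are orthonormal in this sketch, projecting a back-projected matrix returns it unchanged.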
Taxonomy
Topics: Surgical Simulation and Training · Advanced Neural Network Applications · Soft Robotics and Applications
