See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors
Kunyi Yang, Qingyu Wang, Cheng Yuan, Yutong Ban

TL;DR
This paper introduces DepSeg, a training-free surgical scene segmentation method that leverages monocular depth priors and pretrained vision models to achieve annotation-efficient segmentation in laparoscopic scenes.
Contribution
DepSeg is the first training-free framework to combine monocular depth estimation with pretrained vision models for surgical scene segmentation, which reduces annotation costs.
Findings
On CholecSeg8k, DepSeg reaches 35.9% mIoU versus 14.7% for a direct SAM2 auto segmentation baseline.
DepSeg maintains competitive performance when using only 10-20% of the object templates.
Depth-guided prompting and template classification improve segmentation efficiency.
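For readers unfamiliar with the metric behind the 35.9% vs. 14.7% comparison, here is a minimal sketch of mean intersection-over-union (mIoU) on label maps. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration. The page does not state the paper's exact evaluation protocol (for example, per-image versus dataset-level averaging).

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred and target are 2-D lists of integer class ids with the same shape.
    Skipping classes absent from both maps is an assumption of this sketch.
    """
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                in_pred = (p == c)
                in_target = (t == c)
                inter += in_pred and in_target  # pixel counted in both maps
                union += in_pred or in_target   # pixel counted in either map
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 example with two classes.
pred = [[0, 0],
        [1, 1]]
target = [[0, 1],
          [1, 1]]
score = mean_iou(pred, target, num_classes=2)
# Class 0: intersection 1, union 2. Class 1: intersection 2, union 3.
print(round(score, 4))
```

Under this definition a class that is entirely mislabeled contributes an IoU of 0, so a method that produces many unlabeled or wrongly labeled masks scores low on the average.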
Abstract
Pixel-wise segmentation of laparoscopic scenes is essential for computer-assisted surgery but difficult to scale due to the high cost of dense annotations. We propose depth-guided surgical scene segmentation (DepSeg), a training-free framework that utilizes monocular depth as a geometric prior together with pretrained vision foundation models. DepSeg first estimates a relative depth map with a pretrained monocular depth estimation network and proposes depth-guided point prompts, which SAM2 converts into class-agnostic masks. Each mask is then described by a pooled pretrained visual feature and classified via template matching against a template bank built from annotated frames. On the CholecSeg8k dataset, DepSeg improves over a direct SAM2 auto segmentation baseline (35.9% vs. 14.7% mIoU) and maintains competitive performance even when using only 10-20% of the object templates. These…
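The classification step in the abstract (pool a feature over each mask, then match it against a template bank) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. Mean pooling, cosine similarity, nearest-template assignment, the function names, and the toy two-dimensional features are all hypothetical choices; the abstract only says the feature is "pooled" and classified by "template matching". The class names are used purely as example labels.

```python
import math


def masked_mean_feature(features, mask):
    """Average the per-pixel feature vectors over pixels where mask is truthy.

    features: H x W grid of equal-length feature vectors (nested lists).
    mask:     H x W grid of 0/1 values for one class-agnostic mask.
    Mean pooling is an assumption; the source only says "pooled".
    """
    dim = len(features[0][0])
    total = [0.0] * dim
    count = 0
    for f_row, m_row in zip(features, mask):
        for vec, inside in zip(f_row, m_row):
            if inside:
                count += 1
                for i, value in enumerate(vec):
                    total[i] += value
    if count == 0:
        raise ValueError("mask contains no pixels")
    return [t / count for t in total]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify_mask(descriptor, template_bank):
    """Return (label, score) of the most similar template in the bank.

    template_bank maps a class label to a list of template feature vectors.
    Nearest-template assignment is an assumed matching rule.
    """
    best_label, best_score = None, -2.0  # cosine similarity is never below -1
    for label, templates in template_bank.items():
        for template in templates:
            score = cosine(descriptor, template)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


# Toy example: a 2x2 feature map, one mask covering the top row.
features = [[[1.0, 0.0], [0.9, 0.1]],
            [[0.0, 1.0], [0.1, 0.9]]]
mask = [[1, 1],
        [0, 0]]
bank = {"liver": [[1.0, 0.0]],
        "gallbladder": [[0.0, 1.0], [0.2, 0.8]]}

descriptor = masked_mean_feature(features, mask)
label, score = classify_mask(descriptor, bank)
print(label)
```

Because the bank is just a dictionary of stored vectors, using only 10-20% of the templates amounts to keeping a subset of each list; no retraining is involved in this sketch.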
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Neural Network Applications · Surgical Simulation and Training
