Systematic Evaluation and Guidelines for Segment Anything Model in Surgical Video Analysis
Cheng Yuan, Jian Jiang, Kunyi Yang, Lv Wu, Rui Wang, Zi Meng, Haonan Ping, Ziyu Xu, Yifan Zhou, Wanli Song, Hesheng Wang, Yueming Jin, Qi Dou, and Yutong Ban

TL;DR
This paper systematically evaluates the zero-shot performance of the SAM2 model across diverse surgical video datasets, revealing its strengths and limitations in complex surgical environments and informing future adaptation work.
Contribution
It provides the first comprehensive assessment of SAM2's zero-shot capabilities in surgical videos across multiple procedures and challenges, offering valuable insights for future research.
Findings
SAM2 performs well in structured scenarios such as instrument and multi-organ segmentation.
Performance varies with surgical complexity and dynamic conditions.
The results highlight the need for domain-specific adaptation and improved temporal coherence.
Abstract
Surgical video segmentation is critical for AI to interpret spatial-temporal dynamics in surgery, yet model performance is constrained by limited annotated data. The SAM2 model, pretrained on natural videos, offers potential for zero-shot surgical segmentation, but its applicability in complex surgical environments, with challenges like tissue deformation and instrument variability, remains unexplored. We present the first comprehensive evaluation of the zero-shot capability of SAM2 in 9 surgical datasets (17 surgery types), covering laparoscopic, endoscopic, and robotic procedures. We analyze various prompting (points, boxes, mask) and finetuning (dense, sparse) strategies, robustness to surgical challenges, and generalization across procedures and anatomies. Key findings reveal that while SAM2 demonstrates notable zero-shot adaptability in structured scenarios (e.g., instrument…
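The abstract names three prompt types (points, boxes, mask). As a rough illustration of how such prompts might be derived from an annotated frame, the sketch below builds all three from a binary ground-truth mask using NumPy. The function name `prompts_from_mask`, the choice of the foreground pixel nearest the centroid as the point prompt, and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for this example; the paper's actual prompt-generation protocol is not described in this summary.

```python
import numpy as np

def prompts_from_mask(mask: np.ndarray) -> dict:
    """Derive point, box, and mask prompts from a binary mask (hypothetical helper)."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no foreground pixels")

    # Box prompt: tight bounding box as (x_min, y_min, x_max, y_max).
    box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))

    # Point prompt: the foreground pixel closest to the centroid, so the
    # point lies on the object even when the shape is non-convex.
    cy, cx = ys.mean(), xs.mean()
    nearest = int(np.argmin((ys - cy) ** 2 + (xs - cx) ** 2))
    point = (int(xs[nearest]), int(ys[nearest]))

    # Mask prompt: the mask itself, as booleans.
    return {"point": point, "box": box, "mask": mask.astype(bool)}

# Toy 6x8 frame with a rectangular "instrument" region.
demo = np.zeros((6, 8), dtype=np.uint8)
demo[2:5, 3:7] = 1
prompts = prompts_from_mask(demo)
```

In a zero-shot setup, prompts like these would typically be supplied on the first annotated frame and the model left to propagate the segmentation through the rest of the video; whether that matches the authors' setup is an assumption here.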
Taxonomy
Topics: Digital Imaging in Medicine · Surgical Simulation and Training
