SASVi -- Segment Any Surgical Video

Ssharvien Kumar Sivakumar; Yannik Frisch; Amin Ranem; Anirban Mukhopadhyay

arXiv:2502.09653 · eess.IV · July 2, 2025

TL;DR

SASVi introduces a re-prompting mechanism using a Mask R-CNN Overseer to improve temporal consistency in surgical video segmentation, enabling effective deployment of foundation models with minimal annotations.

Contribution

The paper presents a novel re-prompting approach with an Overseer model that enhances temporal segmentation consistency in surgical videos using limited annotated data.

Findings

01 Significant improvement in temporal consistency over existing methods.

02 Successful application of SAM2 to various surgical datasets.

03 Public release of extensive surgical video annotations.

Abstract

Purpose: Foundation models, trained on multitudes of public datasets, often require additional fine-tuning or re-prompting mechanisms to be applied to visually distinct target domains such as surgical videos. Further, without domain knowledge, they cannot model the specific semantics of the target domain. Hence, when applied to surgical video segmentation, they fail to generalise to sections where previously tracked objects leave the scene or new objects enter. Methods: We propose SASVi, a novel re-prompting mechanism based on a frame-wise Mask R-CNN Overseer model, which is trained on a minimal amount of scarcely available annotations for the target domain. This model automatically re-prompts the foundation model SAM2 when the scene constellation changes, allowing for temporally smooth and complete segmentation of full surgical videos. Results: Re-prompting based on our Overseer model…
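The re-prompting idea in the Methods part of the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that a "scene constellation" can be approximated by the set of class labels the frame-wise Overseer predicts per frame, and that a re-prompt is issued whenever that set differs from what is currently tracked. The function name `find_reprompt_frames` and the instrument labels are hypothetical, and the real SASVi pipeline (see the official repository) may apply additional logic.

```python
from typing import List, Set


def find_reprompt_frames(overseer_classes: List[Set[str]]) -> List[int]:
    """Return frame indices at which the video tracker would be re-prompted.

    overseer_classes[i] is the (hypothetical) set of class labels that a
    frame-wise Overseer model predicts for frame i.
    """
    reprompt_at: List[int] = []
    tracked: Set[str] = set()  # classes the tracker is currently following
    for idx, detected in enumerate(overseer_classes):
        # Frame 0 always needs an initial prompt. Afterwards, re-prompt
        # whenever a class has entered or left the scene (assumed criterion).
        if idx == 0 or detected != tracked:
            reprompt_at.append(idx)
            tracked = set(detected)
    return reprompt_at


if __name__ == "__main__":
    demo = [
        {"forceps"},
        {"forceps"},
        {"forceps", "scissors"},  # a new object enters the scene
        {"forceps", "scissors"},
        {"scissors"},             # a tracked object leaves the scene
    ]
    print(find_reprompt_frames(demo))  # [0, 2, 4]
```

In this toy run, the tracker is prompted on the first frame and again only where the detected class set changes, which is the behaviour the abstract describes at a high level.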

Code & Models

Repositories

MECLabTUDA/SASVi
PyTorch · Official

Models

SsharvienKumar/SASVi
Hugging Face model

Taxonomy

Topics: Aortic Thrombus and Embolism · Venous Thromboembolism Diagnosis and Management

Methods: Region Proposal Network · Convolution · RoIAlign · Softmax · Mask R-CNN