FreeVPS: Repurposing Training-Free SAM2 for Generalizable Video Polyp Segmentation
Qiang Hu, Ying Zhou, Gepeng Ji, Nick Barnes, Qiang Li, Zhiwei Wang

TL;DR
FreeVPS introduces a training-free approach that repurposes SAM2 for stable, generalizable video polyp segmentation by mitigating error accumulation and enhancing temporal coherence in colonoscopy videos.
Contribution
It presents a novel training-free framework that combines spatial and temporal modeling for VPS, addressing error propagation and improving robustness in clinical scenarios.
Findings
Achieves state-of-the-art performance on both in-domain and out-of-domain VPS tasks.
Effectively reduces false positives through intra-association filtering.
Maintains stable long-term tracking in colonoscopy videos.
Abstract
Existing video polyp segmentation (VPS) paradigms usually struggle to balance spatiotemporal modeling and domain generalization, limiting their applicability in real clinical scenarios. To address this challenge, we recast the VPS task as a track-by-detect paradigm that leverages the spatial contexts captured by an image polyp segmentation (IPS) model while integrating the temporal modeling capabilities of Segment Anything Model 2 (SAM2). However, during long-term polyp tracking in colonoscopy videos, SAM2 suffers from error accumulation, resulting in a snowball effect that compromises segmentation stability. We mitigate this issue by repurposing SAM2 as a video polyp segmenter with two training-free modules. In particular, the intra-association filtering module eliminates spatial inaccuracies originating from the detection stage, reducing false positives. The inter-association…
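The abstract does not say how the intra-association filter decides which detections to discard. The sketch below is only an illustration of the general idea of rejecting per-frame detections that disagree with the tracked object, assuming an IoU-overlap criterion. The function names (`iou`, `filter_detections`, `rect`), the `min_iou` threshold, and the pixel-set mask representation are all hypothetical and are not taken from the paper or from the SAM2 API.

```python
from typing import List, Optional, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]  # a mask is modelled as a set of (row, col) foreground pixels


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets; 0.0 when both are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def filter_detections(
    detections: List[Mask],
    tracked: Optional[Mask],
    min_iou: float = 0.3,  # assumed threshold, not a value from the paper
) -> List[Mask]:
    """Keep detections that overlap the tracked mask by at least min_iou.

    When no track exists yet, every non-empty detection is kept so that
    a new track can be started from it.
    """
    if tracked is None:
        return [d for d in detections if d]
    return [d for d in detections if iou(d, tracked) >= min_iou]


def rect(r0: int, c0: int, r1: int, c1: int) -> Mask:
    """Helper: rectangular mask covering rows r0..r1-1 and cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Toy frame: one detection overlaps the tracked polyp, one is far away.
tracked = rect(0, 0, 4, 4)
det_a = rect(1, 1, 5, 5)      # overlaps the track
det_b = rect(10, 10, 12, 12)  # spurious detection elsewhere in the frame
kept = filter_detections([det_a, det_b], tracked)
print(len(kept))
```

In this toy frame the distant detection has zero overlap with the track and is dropped, while the overlapping one survives. A real pipeline would operate on dense mask arrays and would need a policy for re-initializing the track when every detection is rejected.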
