Self-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo-SAM 2 Model
Mobina Mansoori, Sajjad Shahabodini, Jamshid Abouei, Konstantinos N. Plataniotis, and Arash Mohammadi

TL;DR
This paper introduces a hybrid model combining YOLOv8 and SAM 2 for efficient polyp segmentation in colonoscopy images and videos, reducing annotation effort while surpassing existing methods in accuracy.
Contribution
The novel integration of YOLOv8 with SAM 2 enables autonomous prompt generation from bounding boxes, improving polyp segmentation without manual annotations.
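The prompt-generation step can be illustrated with a minimal sketch. It is not the authors' implementation: the `Detection` container, the `detections_to_box_prompts` helper, the normalized center-format input, and the 0.25 confidence threshold are all assumptions made for illustration. It only shows how detector boxes might be filtered and converted into pixel-space (x_min, y_min, x_max, y_max) boxes, the kind of box prompt a SAM-style model typically takes; neither model is actually called.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical container for one detector output (not a library type).

    Coordinates are assumed to be normalized to [0, 1] in center format.
    """
    cx: float
    cy: float
    w: float
    h: float
    confidence: float


def detections_to_box_prompts(
    detections: List[Detection],
    image_width: int,
    image_height: int,
    min_confidence: float = 0.25,  # assumed threshold, not taken from the paper
) -> List[Tuple[float, float, float, float]]:
    """Convert detector boxes into pixel-space corner-format box prompts."""
    prompts = []
    for det in detections:
        # Skip low-confidence detections so they never become prompts.
        if det.confidence < min_confidence:
            continue
        # Center format -> corner format, scaled to pixels.
        x_min = (det.cx - det.w / 2) * image_width
        y_min = (det.cy - det.h / 2) * image_height
        x_max = (det.cx + det.w / 2) * image_width
        y_max = (det.cy + det.h / 2) * image_height
        # Clip to the image so the prompt never extends past the frame.
        x_min = min(max(x_min, 0.0), float(image_width))
        y_min = min(max(y_min, 0.0), float(image_height))
        x_max = min(max(x_max, 0.0), float(image_width))
        y_max = min(max(y_max, 0.0), float(image_height))
        prompts.append((x_min, y_min, x_max, y_max))
    return prompts


if __name__ == "__main__":
    example = [
        Detection(cx=0.5, cy=0.5, w=0.2, h=0.4, confidence=0.9),
        Detection(cx=0.1, cy=0.1, w=0.4, h=0.4, confidence=0.1),
    ]
    for box in detections_to_box_prompts(example, image_width=100, image_height=200):
        print(box)
```

In a full pipeline, each returned box would be passed to the segmentation model as a prompt in place of a manually drawn one, which is where the annotation savings described above would come from.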
Findings
Outperforms state-of-the-art models in segmentation accuracy on both image and video tasks
Reduces annotation time by using bounding box prompts
Evaluated on five benchmark colonoscopy image datasets and two colonoscopy video datasets
Abstract
Early diagnosis and treatment of polyps during colonoscopy are essential for reducing the incidence and mortality of Colorectal Cancer (CRC). However, the variability in polyp characteristics and the presence of artifacts in colonoscopy images and videos pose significant challenges for accurate and efficient polyp detection and segmentation. This paper presents a novel approach to polyp segmentation by integrating the Segment Anything Model (SAM 2) with the YOLOv8 model. Our method leverages YOLOv8's bounding box predictions to autonomously generate input prompts for SAM 2, thereby reducing the need for manual annotations. We conducted exhaustive tests on five benchmark colonoscopy image datasets and two colonoscopy video datasets, demonstrating that our method exceeds state-of-the-art models in both image and video segmentation tasks. Notably, our approach achieves high segmentation…
Taxonomy
Topics: Colorectal Cancer Screening and Detection · Gastric Cancer Management and Outcomes
Methods: Segment Anything Model · You Only Look Once
