Eye image segmentation using visual and concept prompts with Segment Anything Model 3 (SAM3)
Diederick C. Niehorster, Marcus Nyström

TL;DR
This study compares SAM3 with its predecessor SAM2 for eye image segmentation and finds that, in most cases, SAM3 with either visual or concept (text) prompts does not perform better than SAM2, which is also faster, on both lab and in-the-wild datasets.
Contribution
The paper evaluates SAM3's eye image segmentation performance with both visual and concept (text) prompts, compares it with SAM2, and provides an adaptation of SAM3's codebase for processing videos of arbitrary length.
Findings
In most cases, SAM3 does not outperform SAM2 in segmentation performance, and it is slower.
SAM2 remains the preferred model for eye image segmentation.
Code for processing videos of arbitrary length is provided.
Abstract
Previous work has reported that vision foundation models show promising zero-shot performance in eye image segmentation. Here we examine whether the latest iteration of the Segment Anything Model, SAM3, offers better eye image segmentation performance than SAM2, and explore the performance of its new concept (text) prompting mode. Eye image segmentation performance was evaluated using diverse datasets encompassing both high-resolution high-quality videos from a lab environment and the TEyeD dataset consisting of challenging eye videos acquired in the wild. Results show that in most cases SAM3 with either visual or concept prompts did not perform better than SAM2, for both lab and in-the-wild datasets. Since SAM2 not only performed better but was also faster, we conclude that SAM2 remains the best option for eye image segmentation. We provide our adaptation of SAM3's codebase that allows…
Taxonomy
Topics: Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology · Retinal Imaging and Analysis
