Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes
Xiaoqi Zhao, Youwei Pang, Shijie Chang, Yuan Zhao, Lihe Zhang, Chenyang Yu, Hanqi Liu, Jiaming Zuo, Jinsong Ouyang, Weisi Lin, Georges El Fakhri, Huchuan Lu, Xiaofeng Liu

TL;DR
This paper thoroughly evaluates SAM and SAM 2 on diverse context-dependent concepts across multiple modalities, revealing their strengths and limitations in understanding complex visual contexts and guiding future segmentation model development.
Contribution
The paper introduces a comprehensive evaluation framework for SAM and SAM 2 covering 11 context-dependent concepts across various scenes and modalities, with multiple prompt strategies and prompt-robustness testing.
Findings
SAM and SAM 2 perform well on context-independent concepts.
Performance varies significantly on context-dependent concepts.
Prompt robustness impacts segmentation accuracy in real-world scenarios.
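To make the prompt-robustness finding concrete, here is a minimal sketch of how such a test could be set up. It is not the paper's protocol: the jitter scheme, the helper names `jitter_box`, `mask_iou`, and `box_to_mask`, and the toy 8x8 scene are all assumptions for illustration, and `box_to_mask` is only a stand-in for a real segmentation model such as SAM. The sketch perturbs a box prompt by a fraction of its size and scores the resulting mask against ground truth with intersection over union (IoU).

```python
import random


def jitter_box(box, scale, rng):
    """Shift each box coordinate by up to `scale` times the box width/height.

    Hypothetical perturbation scheme, not taken from the paper.
    """
    x0, y0, x1, y1 = box
    dx, dy = scale * (x1 - x0), scale * (y1 - y0)
    return (
        x0 + rng.uniform(-dx, dx),
        y0 + rng.uniform(-dy, dy),
        x1 + rng.uniform(-dx, dx),
        y1 + rng.uniform(-dy, dy),
    )


def mask_iou(pred, gt):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def box_to_mask(box, height, width):
    """Stand-in for a model: mark pixels whose centers fall inside the box."""
    x0, y0, x1, y1 = box
    return [
        [1 if (x0 <= c + 0.5 < x1 and y0 <= r + 0.5 < y1) else 0
         for c in range(width)]
        for r in range(height)
    ]


def demo(scale=0.25, seed=0):
    """Compare IoU for a clean box prompt and a jittered one on a toy scene."""
    rng = random.Random(seed)
    gt_box = (2, 2, 6, 6)
    gt_mask = box_to_mask(gt_box, 8, 8)
    clean_iou = mask_iou(box_to_mask(gt_box, 8, 8), gt_mask)
    noisy_box = jitter_box(gt_box, scale, rng)
    noisy_iou = mask_iou(box_to_mask(noisy_box, 8, 8), gt_mask)
    return clean_iou, noisy_iou
```

Calling `demo()` returns the IoU for the clean prompt and for one jittered prompt. In a real evaluation, `box_to_mask` would be replaced by a model call and the scores averaged over many images and perturbation draws.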
Abstract
As large-scale foundation models trained on billions of image-mask pairs covering a vast diversity of scenes, objects, and contexts, SAM and its upgraded version, SAM 2, have significantly influenced multiple fields within computer vision. Leveraging such unprecedented data diversity, they exhibit strong open-world segmentation capabilities, with SAM 2 further enhancing these capabilities to support high-quality video segmentation. While SAMs (SAM and SAM 2) have demonstrated excellent performance in segmenting context-independent concepts like people, cars, and roads, they overlook more challenging context-dependent (CD) concepts, such as visual saliency, camouflage, industrial defects, and medical lesions. CD concepts rely heavily on global and local contextual information, making them susceptible to shifts in different contexts, which requires strong discriminative capabilities from…
Taxonomy
Topics: Data Quality and Management · Software Engineering Techniques and Practices · Software Engineering Research
Methods: Segment Anything Model
