SSP-SAM: SAM with Semantic-Spatial Prompt for Referring Expression Segmentation
Wei Tang, Xuejing Liu, Yanpeng Sun, and Zechao Li

TL;DR
SSP-SAM enhances the Segment Anything Model by integrating a Semantic-Spatial Prompt encoder with visual and linguistic attention, significantly improving referring expression segmentation accuracy and flexibility in open-vocabulary scenarios.
Contribution
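This paper introduces SSP-SAM, a framework that equips SAM with a Semantic-Spatial Prompt (SSP) encoder containing visual and linguistic attention adapters, improving its performance on Referring Expression Segmentation (RES) and Generalized RES (GRES) tasks.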
Findings
Achieves high-quality segmentation masks with strong precision at strict thresholds.
Outperforms existing RES methods on benchmark datasets.
Supports flexible GRES settings without additional modifications.
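The precision figure in the findings refers to a threshold-based metric. In RES evaluation this is conventionally the fraction of test samples whose predicted mask overlaps the ground truth with an IoU above a chosen threshold. The sketch below illustrates that convention only and is not the paper's evaluation code: the function names `mask_iou` and `precision_at` are hypothetical, masks are assumed to be equally sized 2D lists of 0/1, the comparison is assumed to be strict (`>`), and two empty masks are assumed to count as a perfect match.

```python
def mask_iou(pred, target):
    """IoU between two binary masks given as equally sized 2D lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


def precision_at(preds, targets, threshold):
    """Fraction of samples whose IoU is strictly above the threshold."""
    if not preds:
        raise ValueError("need at least one sample")
    hits = sum(1 for p, t in zip(preds, targets) if mask_iou(p, t) > threshold)
    return hits / len(preds)


if __name__ == "__main__":
    target = [[1, 1], [1, 1]]
    exact = [[1, 1], [1, 1]]    # IoU 1.0 against target
    partial = [[1, 1], [0, 0]]  # IoU 0.5 against target
    print(precision_at([exact, partial], [target, target], 0.9))
    print(precision_at([exact, partial], [target, target], 0.4))
```

A stricter threshold rewards only near-exact masks, which is why precision at a high threshold is often read as a proxy for mask quality rather than for coarse localization.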
Abstract
The Segment Anything Model (SAM) excels at general image segmentation but has limited ability to understand natural language, which restricts its direct application in Referring Expression Segmentation (RES). Toward this end, we propose SSP-SAM, a framework that fully utilizes SAM's segmentation capabilities by integrating a Semantic-Spatial Prompt (SSP) encoder. Specifically, we incorporate both visual and linguistic attention adapters into the SSP encoder, which highlight salient objects within the visual features and discriminative phrases within the linguistic features. This design enhances the referent representation for the prompt generator, resulting in high-quality SSPs that enable SAM to generate precise masks guided by language. Although not specifically designed for Generalized RES (GRES), where the referent may correspond to zero, one, or multiple objects, SSP-SAM naturally…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Neural Network Applications
