SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation
Claudia Cuttano, Gabriele Trivigno, Giuseppe Averta, Carlo Masone

TL;DR
SANSA leverages the rich semantic features of SAM2, repurposing it for few-shot segmentation with minimal modifications, achieving state-of-the-art results and flexible prompt support.
Contribution
The paper introduces SANSA, a framework that explicitly aligns SAM2's latent semantic structure for improved few-shot segmentation performance.
Findings
SANSA outperforms existing methods on few-shot segmentation benchmarks.
It supports flexible prompts, such as points, boxes, or scribbles.
It is faster and more compact than prior approaches.
Abstract
Few-shot segmentation aims to segment unseen object categories from just a handful of annotated examples. This requires mechanisms that can both identify semantically related objects across images and accurately produce segmentation masks. We note that Segment Anything 2 (SAM2), with its prompt-and-propagate mechanism, offers both strong segmentation capabilities and a built-in feature matching process. However, we show that its representations are entangled with task-specific cues optimized for object tracking, which impairs its use for tasks requiring higher-level semantic understanding. Our key insight is that, despite its class-agnostic pretraining, SAM2 already encodes rich semantic structure in its features. We propose SANSA (Semantically AligNed Segment Anything 2), a framework that makes this latent structure explicit, and repurposes SAM2 for few-shot segmentation through…
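To make the task setup concrete, the sketch below shows a generic prototype-matching baseline for one-shot segmentation: pool the support features under the annotated mask, then label query locations by cosine similarity to that pooled vector. This is not SANSA's method and does not use SAM2; the function names, the (H, W, C) feature layout, and the 0.5 threshold are all assumptions made for illustration, and the features are taken as given rather than produced by any particular encoder.

```python
import numpy as np


def masked_average_prototype(features, mask):
    """Average the feature vectors that fall inside the support mask.

    features: (H, W, C) array of per-location features (assumed layout).
    mask:     (H, W) binary array marking the annotated object.
    """
    weights = mask.astype(features.dtype)[..., None]  # (H, W, 1)
    total = weights.sum()
    if total == 0:
        raise ValueError("support mask is empty")
    return (features * weights).sum(axis=(0, 1)) / total  # (C,)


def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between each location's feature and the prototype."""
    feat_norm = np.linalg.norm(features, axis=-1) + eps  # (H, W)
    proto_norm = np.linalg.norm(prototype) + eps
    return (features @ prototype) / (feat_norm * proto_norm)


def segment_query(support_feats, support_mask, query_feats, threshold=0.5):
    """Predict a binary query mask from a single annotated support example.

    The threshold is an illustrative choice, not a value from the paper.
    """
    prototype = masked_average_prototype(support_feats, support_mask)
    return cosine_similarity_map(query_feats, prototype) > threshold
```

A baseline like this covers only the "identify semantically related objects across images" half of the problem; the abstract's point is that SAM2 additionally supplies strong mask prediction and its own feature matching.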
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
