TL;DR
PrAda is a parameter-efficient method that adapts frozen text-prompted segmentation models to specialized target domains using only a few examples.
Contribution
Introduces the problem of Few-Shot Visual Adaptation for text-prompted segmentation and proposes PrAda, a new prototype adaptation approach for effective domain-specific segmentation.
Findings
PrAda significantly outperforms state-of-the-art methods across multiple segmentation benchmarks.
The method effectively preserves zero-shot capabilities while enabling strong domain adaptation.
Experiments demonstrate PrAda's superiority in semantic, instance, and panoptic segmentation tasks.
Abstract
Segmenting images is critical for visual understanding but demands extensive pixel-level annotations. Foundation models have enabled new paradigms for predicting new classes guided by textual prompts, without annotations from the target domain. Yet on specialized target domains, far from the original pre-training data, their performance degrades. We study the errors of existing methods under such domain shift, finding that misclassification rather than mask generation is the main culprit. To address this, we introduce the novel problem of Few-Shot Visual Adaptation for text-prompted Segmentation. This kind of adaptation has been widely studied for image classification, but it remains unexplored for segmentation. We tackle this task with Prototype Adaptation (PrAda), a novel, parameter-efficient method that adapts a frozen text-prompted segmentation model. Our approach learns…
