Spectral Prompt Tuning: Unveiling Unseen Classes for Zero-Shot Semantic Segmentation
Wenhao Xu, Rongtao Xu, Changwei Wang, Shibiao Xu, Li Guo, Man Zhang, Xiaopeng Zhang

TL;DR
This paper introduces SPT-SEG, a one-stage zero-shot semantic segmentation method that leverages spectral prompt tuning and a spectral guided decoder to improve unseen class segmentation accuracy.
Contribution
It proposes Spectral Prompt Tuning and a Spectral Guided Decoder to enhance CLIP's pixel-level unseen class segmentation in a single-stage framework.
Findings
Outperforms state-of-the-art methods on public datasets.
Excels particularly in segmenting unseen classes.
Demonstrates robustness across all classes.
Abstract
Recently, CLIP has found practical utility in the domain of pixel-level zero-shot segmentation tasks. The present landscape features two-stage methodologies beset by issues such as intricate pipelines and elevated computational costs. While current one-stage approaches alleviate these concerns and incorporate Visual Prompt Training (VPT) to uphold CLIP's generalization capacity, they still fall short in fully harnessing CLIP's potential for pixel-level unseen class demarcation and precise pixel predictions. To further stimulate CLIP's zero-shot dense prediction capability, we propose SPT-SEG, a one-stage approach that improves CLIP's adaptability from image to pixel. Specifically, we initially introduce Spectral Prompt Tuning (SPT), incorporating spectral prompts into the CLIP visual encoder's shallow layers to capture structural intricacies of images, thereby enhancing comprehension of…
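The abstract is truncated before it describes how spectral prompts are built, so the following is only a minimal sketch of the general idea: derive a frequency-domain feature from the patch tokens and prepend prompt tokens to the sequence fed to a shallow encoder layer. It is not the authors' implementation. The function name `add_spectral_prompts`, the `base_prompts` parameter, and the choice of a mean-pooled FFT amplitude are all assumptions made for illustration.

```python
import numpy as np


def add_spectral_prompts(patch_tokens, grid_hw, base_prompts):
    """Prepend spectrum-conditioned prompt tokens to a patch-token sequence.

    Hypothetical illustration only, not the SPT-SEG implementation.
    patch_tokens: (N, D) array with N = H * W patch embeddings.
    grid_hw:      (H, W) layout of the patches.
    base_prompts: (P, D) array standing in for learnable prompt tokens.
    Returns an array of shape (P + N, D).
    """
    h, w = grid_hw
    n, d = patch_tokens.shape
    if n != h * w:
        raise ValueError("number of tokens must equal H * W")

    # Restore the spatial layout so the FFT runs over the image plane.
    grid = patch_tokens.reshape(h, w, d)

    # Amplitude spectrum per channel (assumed choice of spectral feature).
    amplitude = np.abs(np.fft.fft2(grid, axes=(0, 1)))

    # Pool over all frequencies to get one D-dimensional descriptor.
    spectral_feature = amplitude.mean(axis=(0, 1))
    spectral_feature = spectral_feature / (np.linalg.norm(spectral_feature) + 1e-8)

    # Condition every prompt on the descriptor, then prepend to the tokens.
    prompts = base_prompts + spectral_feature
    return np.concatenate([prompts, patch_tokens], axis=0)


rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))   # a 4 x 4 grid of 8-dim patch tokens
base = np.zeros((4, 8))                 # 4 prompt tokens, zero-initialized
extended = add_spectral_prompts(tokens, (4, 4), base)
```

In a real encoder, the extended sequence would be passed only through the shallow layers named in the abstract, and `base_prompts` would be trained while CLIP's weights stay frozen; both details are assumptions here as well.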
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · COVID-19 diagnosis using AI
Methods: Focus · Contrastive Language-Image Pre-training
