Feature-Proxy Transformer for Few-Shot Segmentation
Jian-Wei Zhang, Yifan Sun, Yi Yang, Wei Chen

TL;DR
This paper introduces FPTrans, a simple yet effective feature-proxy transformer for few-shot segmentation. It replaces the intricate decoders of prior methods with a linear classification head and adds a prompting strategy.
Contribution
The paper proposes FPTrans, a method built on a straightforward framework of a feature extractor plus a linear classification head, combined with prompting and multiple background proxies.
Findings
Achieves accuracy competitive with state-of-the-art methods.
Uses a prompting strategy to improve interaction between query and support features.
Employs multiple background proxies to handle background heterogeneity.
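One way to picture how several background proxies could cope with a heterogeneous background is sketched below. This is a minimal illustration, not the authors' implementation: the function name `predict_foreground`, the use of a maximum over background scores, and the toy similarity values are all assumptions made for this example.

```python
import numpy as np

def predict_foreground(fg_score, bg_scores):
    """Decide foreground vs. background per pixel (hypothetical helper).

    fg_score:  (H, W) similarity of each pixel to the single foreground proxy.
    bg_scores: (K, H, W) similarity of each pixel to each of K background proxies.
    Returns a boolean (H, W) mask that is True where foreground wins.
    """
    # Assumption: a pixel is background if ANY background proxy matches it
    # well, so the background score is the maximum over the K proxies.
    best_bg = bg_scores.max(axis=0)
    return fg_score > best_bg

# Toy 2x2 similarity maps with K = 2 background proxies (made-up numbers).
fg = np.array([[0.9, 0.2],
               [0.4, 0.1]])
bg = np.array([
    [[0.1, 0.8], [0.3, 0.0]],   # background proxy 0
    [[0.2, 0.1], [0.7, 0.05]],  # background proxy 1
])
mask = predict_foreground(fg, bg)
print(mask)
```

In this toy case each background proxy explains a different pixel (top-right and bottom-left), which a single averaged background vector might fail to do.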
Abstract
Few-shot segmentation (FSS) aims at performing semantic segmentation on novel classes given a few annotated support samples. With a rethink of recent advances, we find that the current FSS framework has deviated far from the supervised segmentation framework: Given the deep features, FSS methods typically use an intricate decoder to perform sophisticated pixel-wise matching, while the supervised segmentation methods use a simple linear classification head. Due to the intricacy of the decoder and its matching pipeline, it is not easy to follow such an FSS framework. This paper revives the straightforward framework of "feature extractor + linear classification head" and proposes a novel Feature-Proxy Transformer (FPTrans) method, in which the "proxy" is the vector representing a semantic class in the linear classification head. FPTrans has two keypoints for learning discriminative…
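The idea of a proxy as "the vector representing a semantic class in the linear classification head" can be sketched roughly as follows. This is a hedged illustration only: deriving proxies by masked average pooling over support features is a common choice in few-shot segmentation and is assumed here rather than taken from the text above, and the names `masked_average_proxy` and `linear_head`, the feature sizes, and the random features are all hypothetical.

```python
import numpy as np

def masked_average_proxy(feat, mask):
    """Average the features inside a mask to get one proxy vector.

    feat: (C, H, W) deep features of a support image.
    mask: (H, W) boolean mask selecting the pixels of one class.
    Returns a (C,) vector. Assumed pooling scheme, not confirmed by the source.
    """
    weights = mask.astype(feat.dtype)
    total = weights.sum()
    if total == 0:
        raise ValueError("mask selects no pixels")
    # Broadcasting (C, H, W) * (H, W) zeroes out unselected pixels.
    return (feat * weights).sum(axis=(1, 2)) / total

def linear_head(feat, proxies):
    """Linear classification head: per-pixel dot product with each proxy.

    feat:    (C, H, W) query features.
    proxies: (N, C), one row per class.
    Returns (N, H, W) logits.
    """
    return np.einsum("nc,chw->nhw", proxies, feat)

rng = np.random.default_rng(0)
support_feat = rng.normal(size=(8, 4, 4))   # stand-in for extractor output
support_mask = np.zeros((4, 4), dtype=bool)
support_mask[1:3, 1:3] = True               # annotated foreground region

fg_proxy = masked_average_proxy(support_feat, support_mask)
bg_proxy = masked_average_proxy(support_feat, ~support_mask)
proxies = np.stack([bg_proxy, fg_proxy])    # class 0 = background, 1 = foreground

query_feat = rng.normal(size=(8, 4, 4))
logits = linear_head(query_feat, proxies)
pred = logits.argmax(axis=0)                # (H, W) class index per pixel
```

The point of the sketch is that, once proxies exist, segmentation reduces to a single matrix product over pixel features, with no matching decoder in between.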
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
Methods: Multi-Head Attention · Linear Layer · Byte Pair Encoding · Absolute Position Encodings · Layer Normalization · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Adam · Dense Connections
