Prototype as Query for Few Shot Semantic Segmentation
Leilei Cao, Yibo Guo, Ye Yuan, Qiangguo Jin

TL;DR
ProtoFormer introduces a Transformer-based framework for few-shot semantic segmentation that effectively captures spatial details and improves segmentation accuracy on standard benchmarks.
Contribution
It proposes a novel Transformer-based approach that models spatial details using support prototypes as queries, enhancing segmentation performance in few-shot scenarios.
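To make the "prototype as query" idea concrete, here is a minimal single-head sketch of cross-attention in which one prototype vector attends over a set of pixel features. It is an illustration under assumptions, not the authors' implementation. The summary says only that support prototypes act as queries. Using the query-image pixel features as both keys and values, omitting learned projections, and the names `prototype_cross_attention` and `pixels` are all hypothetical choices made here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def prototype_cross_attention(prototype, pixel_features):
    """Scaled dot-product attention with a single query token.

    prototype:      list of C floats (assumed to come from the support image)
    pixel_features: N lists of C floats (assumed flattened query-image features,
                    used here as both keys and values; no learned projections)
    Returns (weights, attended): N attention weights and a C-dim output vector.
    """
    channels = len(prototype)
    scale = 1.0 / math.sqrt(channels)
    scores = [
        scale * sum(p * f for p, f in zip(prototype, feat))
        for feat in pixel_features
    ]
    weights = softmax(scores)
    attended = [
        sum(w * feat[c] for w, feat in zip(weights, pixel_features))
        for c in range(channels)
    ]
    return weights, attended


# Toy example: pixels 0 and 2 align with the prototype, pixel 1 does not.
prototype = [1.0, 0.0]
pixels = [[4.0, 0.0], [0.0, 4.0], [4.0, 0.0]]
weights, attended = prototype_cross_attention(prototype, pixels)
```

Because there is one query token, the attention weights form a distribution over spatial locations, which is one way a prototype can pick out where the target class appears in the query image.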
Findings
Achieves state-of-the-art results on PASCAL-5i and COCO-20i datasets.
Effectively captures spatial details with reduced computational cost.
Outperforms existing methods in few-shot semantic segmentation.
Abstract
Few-shot Semantic Segmentation (FSS) was proposed to segment unseen classes in a query image, referring to only a few annotated examples named support images. One of the characteristics of FSS is spatial inconsistency between query and support targets, e.g., in texture or appearance. This greatly challenges the generalization ability of FSS methods, which must effectively exploit the dependency between the query image and the support examples. Most existing methods abstracted support features into prototype vectors and implemented the interaction with query features using cosine similarity or feature concatenation. However, this simple interaction may not capture spatial details in query features. To alleviate this limitation, a few methods utilized all pixel-wise support information via computing the pixel-wise correlations between paired query and support features implemented with…
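The prototype-based baseline that the abstract describes (support features abstracted into a prototype vector, then compared with query features by cosine similarity) can be sketched as follows. This is a generic illustration, not code from the paper. Masked average pooling is assumed as the way to form the prototype, which the abstract does not specify, and all function and variable names are hypothetical.

```python
import math


def masked_average_pool(features, mask):
    """Average the feature vectors at foreground positions.

    features: H x W x C nested lists of floats (support feature map)
    mask:     H x W nested lists of 0/1 (support annotation)
    Returns a C-dim prototype vector.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("support mask has no foreground pixels")
    return [t / count for t in total]


def cosine(u, v, eps=1e-8):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def similarity_map(query_features, prototype):
    """Compare every query pixel with the prototype; returns an H x W map."""
    return [[cosine(vec, prototype) for vec in row] for row in query_features]


# Toy 2x2 support map with 2 channels; the left column is foreground.
support = [[[1.0, 0.0], [0.0, 1.0]],
           [[1.0, 0.0], [0.0, 1.0]]]
support_mask = [[1, 0],
                [1, 0]]
# Toy 1x2 query map: the second pixel points in the prototype's direction.
query = [[[0.0, 2.0], [3.0, 0.0]]]

prototype = masked_average_pool(support, support_mask)
sim = similarity_map(query, prototype)
```

Each query pixel is scored against a single pooled vector, so any spatial structure in the support features is discarded before the comparison. That is the limitation the abstract points to.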
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Softmax · Adam · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings
