Intermediate Prototype Mining Transformer for Few-Shot Semantic Segmentation
Yuanwei Liu, Nian Liu, Xiwen Yao, Junwei Han

TL;DR
This paper introduces an Intermediate Prototype Mining Transformer (IPMT) for few-shot semantic segmentation, which iteratively refines prototypes to better handle intra-class diversity and improves segmentation accuracy.
Contribution
The paper proposes the first use of an intermediate prototype in a transformer framework to adaptively mine category information from support and query images.
Findings
IPMT outperforms previous methods on the PASCAL-5^i and COCO-20^i datasets.
Iterative prototype refinement improves segmentation accuracy.
The method effectively handles intra-class diversity.
Abstract
Few-shot semantic segmentation aims to segment the target objects in query under the condition of a few annotated support images. Most previous works strive to mine more effective category information from the support to match with the corresponding objects in query. However, they all ignored the category information gap between query and support images. If the objects in them show large intra-class diversity, forcibly migrating the category information from the support to the query is ineffective. To solve this problem, we are the first to introduce an intermediate prototype for mining both deterministic category information from the support and adaptive category knowledge from the query. Specifically, we design an Intermediate Prototype Mining Transformer (IPMT) to learn the prototype in an iterative way. In each IPMT layer, we propagate the object information in both support and…
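The abstract is cut off before it describes what each IPMT layer does, so the following is only a rough illustration of the general idea: a prototype is seeded from the support image and then repeatedly updated with information from the query. It is a minimal sketch, not the authors' transformer. Masked average pooling, cosine-similarity matching, the fixed threshold, and the equal-weight averaging are all assumptions chosen for simplicity, and the function names (`masked_average_pool`, `cosine_similarity_map`, `refine_prototype`) are hypothetical.

```python
import numpy as np


def masked_average_pool(features, mask):
    """Average the (H, W, C) features over the pixels selected by an (H, W) mask."""
    weights = mask[..., None]
    return (features * weights).sum(axis=(0, 1)) / (mask.sum() + 1e-8)


def cosine_similarity_map(features, prototype):
    """Cosine similarity between every pixel feature and the prototype, shape (H, W)."""
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + 1e-8)
    p = prototype / (np.linalg.norm(prototype) + 1e-8)
    return f @ p


def refine_prototype(support_feat, support_mask, query_feat,
                     num_layers=3, threshold=0.5):
    """Iteratively mix support and query information into one prototype.

    Assumption: this simple averaging loop stands in for the paper's
    attention-based IPMT layers, which the truncated abstract does not specify.
    """
    # Deterministic category information from the annotated support image.
    support_proto = masked_average_pool(support_feat, support_mask)
    prototype = support_proto
    query_mask = np.zeros(query_feat.shape[:2])

    for _ in range(num_layers):
        # Predict a query mask with the current intermediate prototype.
        sim = cosine_similarity_map(query_feat, prototype)
        query_mask = (sim > threshold).astype(float)

        # Adaptive category knowledge from the query's predicted foreground.
        if query_mask.sum() > 0:
            query_proto = masked_average_pool(query_feat, query_mask)
            prototype = 0.5 * (support_proto + query_proto)
        else:
            prototype = support_proto

    return prototype, query_mask
```

With this kind of loop, the prototype drifts toward the query's own appearance of the class, which is one way to picture how an intermediate prototype could narrow the support-query gap that the abstract describes.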
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Linear Layer · Byte Pair Encoding · Absolute Position Encodings · Layer Normalization · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Adam · Dense Connections
