KNN Transformer with Pyramid Prompts for Few-Shot Learning
Wenhao Li, Qiangchang Wang, Peng Zhao, and Yilong Yin

TL;DR
This paper introduces KNN Transformer with Pyramid Prompts (KTPP), a novel approach for Few-Shot Learning that enhances semantic feature interaction and suppresses irrelevant information, leading to improved recognition with limited data.
Contribution
The paper proposes KTPP, which integrates K-NN context attention and pyramid prompts to better capture semantic relationships and adapt visual features in few-shot learning scenarios.
Findings
Outperforms existing methods on four benchmark datasets.
Effectively suppresses irrelevant tokens during self-attention.
Enhances robustness to spatial variations in visual features.
Abstract
Few-Shot Learning (FSL) aims to recognize new classes with limited labeled data. Recent studies have attempted to address the challenge of scarce samples by using textual prompts to modulate visual features. However, they usually struggle to capture complex semantic relationships between textual and visual features. Moreover, vanilla self-attention is heavily affected by useless information in images, and the many irrelevant tokens involved in the interaction severely constrain the potential of semantic priors in FSL. To address these issues, a K-NN Transformer with Pyramid Prompts (KTPP) is proposed to select discriminative information with K-NN Context Attention (KCA) and adaptively modulate visual features with Pyramid Cross-modal Prompts (PCP). First, for each token, the KCA only selects the K most relevant tokens to compute the self-attention matrix and…
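The token-selection idea in the abstract (each token attends only to its K most relevant tokens) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes relevance is measured by scaled dot-product scores, uses a single head with no learned projections, and the function name `knn_attention` and its parameters are hypothetical. The actual KCA in the paper may differ in how tokens are scored and selected.

```python
import math


def knn_attention(queries, keys, values, k):
    """Single-head attention where each query keeps only its k best keys.

    queries, keys, values: lists of float vectors, one vector per token.
    k: number of keys each query may attend to (assumed hyperparameter).
    Returns (outputs, weights); weights[i][j] is 0.0 for discarded keys.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, all_weights = [], []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [scale * sum(a * b for a, b in zip(q, key)) for key in keys]
        # Indices of the k highest-scoring keys for this query.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        keep = order[:k]
        # Softmax over the kept scores only; all other keys get zero weight.
        top = max(scores[j] for j in keep)
        exps = {j: math.exp(scores[j] - top) for j in keep}
        total = sum(exps.values())
        weights = [exps.get(j, 0.0) / total for j in range(len(scores))]
        # Weighted sum of value vectors.
        out = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy usage: three 2-dimensional tokens attending to each other with k = 2.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = knn_attention(tokens, tokens, tokens, k=2)
```

With `k` equal to the number of tokens, this sketch reduces to ordinary softmax attention; smaller `k` zeroes out the lowest-scoring tokens before normalization, which is the sense in which irrelevant tokens are suppressed.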
Taxonomy
Methods: Attention Is All You Need · Dense Connections · Residual Connection · Dropout · Layer Normalization · Adam · Byte Pair Encoding · Absolute Position Encodings · k-Nearest Neighbors · Softmax
