Class-relevant Patch Embedding Selection for Few-Shot Image Classification
Weihao Jiang, Haoyang Cui, Kun He

TL;DR
This paper introduces a method for selecting class-relevant patch embeddings in few-shot image classification. Filtering out class-irrelevant patches before classification improves accuracy.
Contribution
The proposed approach ranks each image's patch embeddings by their similarity to its class embedding and keeps only the top-ranked ones. This simple, efficient filter improves few-shot classification accuracy.
Findings
Outperforms state-of-the-art baselines in 5-shot and 1-shot tasks
Enhances pattern recognition by filtering irrelevant patches
Proves effective and computationally efficient
Abstract
Effective image classification hinges on discerning relevant features from both foreground and background elements, with the foreground typically holding the critical information. While humans adeptly classify images with limited exposure, artificial neural networks often struggle with feature selection from rare samples. To address this challenge, we propose a novel method for selecting class-relevant patch embeddings. Our approach involves splitting support and query images into patches, encoding them using a pre-trained Vision Transformer (ViT) to obtain class embeddings and patch embeddings, respectively. Subsequently, we filter patch embeddings using class embeddings to retain only the class-relevant ones. For each image, we calculate the similarity between class embedding and each patch embedding, sort the similarity sequence in descending order, and only retain top-ranked patch…
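The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes cosine similarity as the similarity measure (the abstract does not name one), and the function name select_class_relevant_patches and the parameter k (number of patches kept) are hypothetical. The embeddings are taken as given arrays, so the ViT encoder itself is not shown.

```python
import numpy as np


def select_class_relevant_patches(class_emb, patch_embs, k):
    """Keep the k patch embeddings most similar to the class embedding.

    class_emb:  array of shape (d,), the class embedding of one image.
    patch_embs: array of shape (n, d), the patch embeddings of that image.
    k:          number of patches to retain (hypothetical hyperparameter).
    Returns (kept_embeddings, kept_indices, kept_similarities).
    """
    eps = 1e-12  # guards against division by zero for all-zero vectors

    # Assumption: cosine similarity; the source only says "similarity".
    class_unit = class_emb / (np.linalg.norm(class_emb) + eps)
    patch_norms = np.linalg.norm(patch_embs, axis=1, keepdims=True)
    patch_units = patch_embs / (patch_norms + eps)
    sims = patch_units @ class_unit  # shape (n,)

    # Sort similarities in descending order and keep the top-ranked patches.
    order = np.argsort(-sims, kind="stable")
    top = order[:k]
    return patch_embs[top], top, sims[top]
```

Under these assumptions the function would be called once per support or query image, with the class embedding and patch embeddings produced by the pre-trained ViT for that image. Only the returned subset would be passed on to the later classification stage.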
Taxonomy
Topics: COVID-19 diagnosis using AI · AI in cancer detection · Medical Imaging and Analysis
Methods: Attention Is All You Need · Dropout · Label Smoothing · Residual Connection · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Vision Transformer · Linear Layer · Feature Selection
