Sparse Spatial Transformers for Few-Shot Learning
Haoxing Chen, Huaxiong Li, Yaohui Li, Chunlin Chen

TL;DR
This paper introduces SSFormers, a transformer-based approach that enhances few-shot learning by identifying task-relevant image patches through sparse spatial transformers, improving classification accuracy with limited data.
Contribution
The paper proposes a novel sparse spatial transformer architecture that selectively emphasizes relevant image patches, effectively capturing contextual information for few-shot learning.
Findings
Outperforms state-of-the-art few-shot learning methods on benchmark datasets.
Effectively identifies task-relevant features using sparse spatial transformers.
Demonstrates improved generalization with limited training data.
Abstract
Learning from limited data is challenging because data scarcity leads to poor generalization of the trained model. A classical global pooled representation will probably lose useful local information. Many few-shot learning methods have recently addressed this challenge using deep descriptors and learning a pixel-level metric. However, using deep descriptors as feature representations may lose image contextual information. Moreover, most of these methods address each class in the support set independently, and so cannot make sufficient use of discriminative information and task-specific embeddings. In this paper, we propose a novel transformer-based neural network architecture called sparse spatial transformers (SSFormers), which finds task-relevant features and suppresses task-irrelevant features. Particularly, we first divide each input image into several image patches of different sizes…
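The abstract is truncated here, so the exact SSFormers formulation is not available on this page. As a rough illustration only of the general idea it describes (keeping task-relevant patches and suppressing task-irrelevant ones), the sketch below applies a top-k sparse attention between query and support patch features. Everything in it is an assumption made for illustration and is not the authors' implementation: the function name `sparse_patch_attention`, the cosine similarity, the `top_k` and `temperature` parameters, and the random patch features.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def sparse_patch_attention(query_patches, support_patches, top_k=2, temperature=0.1):
    """Hypothetical top-k sparse attention over patch features.

    query_patches:   array of shape (Nq, D), one feature vector per query patch.
    support_patches: array of shape (Ns, D), one feature vector per support patch.
    Returns (attended, weights) with shapes (Nq, D) and (Nq, Ns).
    """
    q = l2_normalize(query_patches)
    s = l2_normalize(support_patches)
    sim = q @ s.T  # (Nq, Ns) cosine similarity between every patch pair

    # For each query patch, find the k-th largest similarity and keep only
    # the support patches at or above it (assumed sparsification rule).
    k = min(top_k, sim.shape[1])
    kth_largest = np.partition(sim, -k, axis=1)[:, -k][:, None]
    keep = sim >= kth_largest

    # Softmax restricted to the kept patches; suppressed patches get weight 0.
    logits = np.where(keep, sim / temperature, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each query patch is re-expressed using only its selected support patches.
    attended = weights @ support_patches
    return attended, weights


# Usage with random stand-in features (4 query patches, 6 support patches).
rng = np.random.default_rng(0)
query = rng.normal(size=(4, 8))
support = rng.normal(size=(6, 8))
attended, weights = sparse_patch_attention(query, support, top_k=2)
print(attended.shape, weights.shape)
```

In this sketch each row of `weights` sums to one and, barring ties, has at most `top_k` non-zero entries, which is the sense in which the attention is sparse. A full few-shot pipeline would also need a feature extractor and a class-level similarity score, which the truncated abstract does not specify.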
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Geophysical Methods and Applications · Machine Learning and ELM
Methods: Attention Is All You Need · Linear Layer · Dropout · Layer Normalization · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Dense Connections · Byte Pair Encoding · Spatial Transformer
