Enhancing Few-shot Image Classification with Cosine Transformer
Quang-Huy Nguyen, Cuong Q. Nguyen, Dung D. Le, Hieu H. Pham

TL;DR
This paper introduces FS-CT, a novel few-shot image classification method using a cosine attention transformer, significantly improving accuracy and robustness over existing approaches, with applications demonstrated on standard datasets and a new Yoga pose dataset.
Contribution
The paper proposes a lightweight, effective few-shot classification framework with a novel cosine attention mechanism, enhancing relational mapping between support and query samples.
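To make the idea of cosine attention concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the summary above does not state the formula, so the sketch assumes that attention weights are the raw cosine similarities between L2-normalized queries and keys, applied to the values without a softmax. The names `l2_normalize`, `cosine_attention`, `prototypes`, and `query_feats` are hypothetical and used only for illustration.

```python
import math


def l2_normalize(vec, eps=1e-8):
    """Scale a vector to unit length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def cosine_attention(queries, keys, values):
    """Weight each value by the cosine similarity between a query and its key.

    Assumption: similarities are used directly as weights, with no softmax.
    """
    q_unit = [l2_normalize(q) for q in queries]
    k_unit = [l2_normalize(k) for k in keys]
    dim = len(values[0])
    outputs = []
    for q in q_unit:
        # Dot product of unit vectors equals cosine similarity, in [-1, 1].
        weights = [sum(a * b for a, b in zip(q, k)) for k in k_unit]
        # Weighted sum of the value vectors, one output dimension at a time.
        out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        outputs.append(out)
    return outputs


# Toy example: two class prototypes attend over three query-sample features.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
query_feats = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
attended = cosine_attention(prototypes, query_feats, query_feats)
```

Because the weights depend only on the angle between vectors, scaling a feature vector changes its contribution as a value but not how strongly it is attended to, which is one reason a cosine-based score can be less sensitive to feature magnitude than a plain dot product.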
Findings
Improves accuracy by 5% to more than 20% compared with baseline methods.
Achieves competitive results on mini-ImageNet, CUB-200, and CIFAR-FS.
Demonstrates a practical application on a Yoga pose recognition dataset.
Abstract
This paper addresses the few-shot image classification problem, where the classification task is performed on unlabeled query samples given only a small number of labeled support samples. One major challenge of the few-shot learning problem is the large variety of object visual appearances, which prevents the support samples from representing that object comprehensively. This can result in a significant difference between support and query samples, undermining the performance of few-shot algorithms. In this paper, we tackle the problem by proposing the Few-shot Cosine Transformer (FS-CT), where the relational map between supports and queries is effectively obtained for the few-shot tasks. FS-CT consists of two parts: a learnable prototypical embedding network to obtain categorical representations from support samples with hard cases, and a transformer encoder to effectively achieve…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
Methods: Attention Is All You Need · Linear Layer · Label Smoothing · Layer Normalization · Absolute Position Encodings · Multi-Head Attention · Dense Connections · Dropout · Residual Connection · Byte Pair Encoding
