Self-Promoted Supervision for Few-Shot Transformer
Bowen Dong, Pan Zhou, Shuicheng Yan, Wangmeng Zuo

TL;DR
This paper introduces SUN, a novel few-shot training framework for vision transformers that enhances token dependency learning and local semantics, significantly improving few-shot classification performance over CNNs and existing ViT methods.
Contribution
The paper proposes SUN, a simple yet effective self-promoted supervision method that addresses the poor few-shot learning performance of ViTs by generating location-specific supervision.
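As a rough illustration of what combining global and location-specific supervision could look like, the sketch below adds a per-token cross-entropy term to the usual image-level classification loss. It is a minimal sketch, not the authors' implementation: the function names, the `local_weight` parameter, and the assumption that per-token soft targets are already available are all hypothetical, and this summary does not describe how SUN actually generates those targets.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(logits, target_probs):
    # Cross-entropy between predicted logits and a (possibly soft) target distribution.
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target_probs, log_probs))

def one_hot(index, num_classes):
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def combined_loss(global_logits, token_logits, label, token_targets, local_weight=0.5):
    """Global classification loss plus a location-specific (per-token) loss.

    global_logits: class logits for the whole image, length C
    token_logits:  one list of C class logits per patch token
    label:         ground-truth class index of the image
    token_targets: one target distribution per patch token (ASSUMED to be given;
                   how they are produced is outside this sketch)
    local_weight:  hypothetical weighting factor, not taken from the paper
    """
    num_classes = len(global_logits)
    global_loss = cross_entropy(global_logits, one_hot(label, num_classes))
    local_losses = [cross_entropy(l, t) for l, t in zip(token_logits, token_targets)]
    local_loss = sum(local_losses) / len(local_losses)
    return global_loss + local_weight * local_loss

# Toy usage: 3 classes, 2 patch tokens, untrained (all-zero) logits.
loss = combined_loss(
    global_logits=[0.0, 0.0, 0.0],
    token_logits=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    label=0,
    token_targets=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
)
```

In this toy form, each patch token receives its own target, so the model is penalized for local predictions as well as for the image-level prediction. That is the general idea behind dense, location-specific supervision.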
Findings
SUN outperforms existing ViT-based few-shot methods.
SUN surpasses CNN-based state-of-the-art in few-shot classification.
The approach improves token dependency learning and local semantics understanding.
Abstract
The few-shot learning ability of vision transformers (ViTs) is rarely investigated, though heavily desired. In this work, we empirically find that with the same few-shot learning frameworks, e.g., Meta-Baseline, replacing the widely used CNN feature extractor with a ViT model often severely impairs few-shot classification performance. Moreover, our empirical study shows that in the absence of inductive bias, ViTs often learn low-quality token dependencies under the few-shot learning regime, where only a few labeled training data are available, which largely contributes to the above performance degradation. To alleviate this issue, for the first time, we propose a simple yet effective few-shot training framework for ViTs, namely Self-promoted sUpervisioN (SUN). Specifically, besides the conventional global supervision for global semantic learning, SUN further pretrains the ViT on the…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Advanced Memory and Neural Computing
