Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime
Prarthana Bhattacharyya, Chenge Li, Xiaonan Zhao, István Fehérvári, Jason Sun

TL;DR
This paper explores the adaptation of self-supervised vision transformers to low-label, high-data scenarios, demonstrating improved performance in few-shot classification and zero-shot retrieval without requiring manual annotations.
Contribution
It introduces a novel approach using self-supervised vision transformers for low-label regimes, achieving state-of-the-art results in few-shot and zero-shot tasks.
Findings
Outperforms state-of-the-art on miniImageNet and CUB200 for few-shot classification.
Achieves up to 11% improvement on zero-shot image retrieval benchmarks.
Demonstrates effectiveness without manual annotations in low-label settings.
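The retrieval finding above does not name its metric. As a minimal sketch, assuming the common Recall@K protocol (a query counts as a hit if any of its K nearest neighbours by cosine similarity shares its class), evaluation over precomputed embeddings could look like the following. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def recall_at_k(embeddings, labels, k):
    """Fraction of queries with at least one same-class item in their top-k.

    Assumption: every item is used once as a query against all other items,
    which is a common (but not the only) retrieval evaluation setup.
    """
    hits = 0
    for i, query in enumerate(embeddings):
        # Score every other item; the query itself is excluded.
        scored = [
            (cosine(query, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if any(labels[j] == labels[i] for _, j in scored[:k]):
            hits += 1
    return hits / len(embeddings)


# Toy data: two well-separated classes in a 2-d embedding space.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [0, 0, 1, 1]
toy_recall = recall_at_k(toy_embeddings, toy_labels, k=1)
```

In a zero-shot setting the classes in the evaluation set are disjoint from anything seen during training, so a score like this reflects how well the embedding generalizes rather than how well it memorized labels.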
Abstract
Self-supervision has shown outstanding results for natural language processing, and more recently, for image recognition. Simultaneously, vision transformers and their variants have emerged as a promising and scalable alternative to convolutions on various computer vision tasks. In this paper, we are the first to question if self-supervised vision transformers (SSL-ViTs) can be adapted to two important computer vision tasks in the low-label, high-data regime: few-shot image classification and zero-shot image retrieval. The motivation is to reduce the number of manual annotations required to train a visual embedder, and to produce generalizable and semantically meaningful embeddings. For few-shot image classification we train SSL-ViTs without any supervision, on external data, and use this trained embedder to adapt quickly to novel classes with a limited number of labels. For zero-shot image…
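The abstract says a frozen, self-supervised embedder is adapted to novel classes from a few labels, but does not say how. One simple way to do that, shown here only as an illustrative sketch, is a nearest-centroid classifier: average the support embeddings of each class into a prototype and assign each query to the closest prototype by cosine similarity. This is a swapped-in, generic technique and not necessarily the paper's adaptation method. The embeddings below are hand-made stand-ins for the output of an SSL-ViT, and all names are hypothetical.

```python
import math
from collections import defaultdict


def l2_normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def class_prototypes(support_embeddings, support_labels):
    """Average the normalized support embeddings of each class."""
    grouped = defaultdict(list)
    for embedding, label in zip(support_embeddings, support_labels):
        grouped[label].append(l2_normalize(embedding))
    prototypes = {}
    for label, members in grouped.items():
        dim = len(members[0])
        mean = [sum(m[i] for m in members) / len(members) for i in range(dim)]
        prototypes[label] = l2_normalize(mean)
    return prototypes


def classify(query_embedding, prototypes):
    """Return the label whose prototype has the highest cosine similarity."""
    query = l2_normalize(query_embedding)

    def similarity(label):
        return sum(a * b for a, b in zip(query, prototypes[label]))

    return max(prototypes, key=similarity)


# Toy 2-way 2-shot episode. Assumption: these 3-d vectors stand in for
# embeddings that a frozen SSL-ViT would produce for the support images.
support = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
support_labels = ["cat", "cat", "dog", "dog"]
prototypes = class_prototypes(support, support_labels)
prediction = classify([0.8, 0.2, 0.0], prototypes)
```

Because the embedder stays frozen, adapting to a new episode only means recomputing the prototypes, so the number of labels per class can stay small.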
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
