Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers
Gabriele Prato, Simon Guiroy, Ethan Caballero, Irina Rish, Sarath Chandar

TL;DR
This paper investigates how the scale of pre-training data influences few-shot image classification. It finds that few-shot performance follows power laws in training data size and that performance on new classes converges faster than on known classes.
Contribution
It demonstrates that few-shot performance follows power laws with respect to training data size, and that the scaling behavior differs between new classes and known classes, advancing the understanding of scaling laws in vision models.
Findings
Few-shot performance follows power-law scaling.
Performance on new classes converges faster than performance on known classes.
Scaling laws apply across different data domains.
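As a rough illustration of what "follows power-law scaling" means in practice, the sketch below fits a curve of the form err ≈ a · n^(-b) to error measurements taken at several training set sizes, using a straight-line fit in log-log space. The functional form, the helper names (fit_power_law, predict_error), and the numbers are assumptions chosen for illustration; they are synthetic and are not the paper's fitting procedure or results.

```python
import numpy as np

def fit_power_law(n, err):
    """Fit err ~ a * n**(-b) by least squares in log-log space.

    Hypothetical helper for illustration only; returns (a, b).
    """
    slope, intercept = np.polyfit(np.log(n), np.log(err), 1)
    return float(np.exp(intercept)), float(-slope)

def predict_error(n, a, b):
    """Evaluate the fitted power law at training set size(s) n."""
    return a * np.asarray(n, dtype=float) ** (-b)

# Synthetic, noise-free measurements (assumed values, not from the paper).
sizes = np.array([1e3, 1e4, 1e5, 1e6])
errors = 2.0 * sizes ** (-0.3)

a, b = fit_power_law(sizes, errors)
extrapolated = predict_error(1e7, a, b)  # extrapolate one decade further
```

A larger fitted exponent b corresponds to error falling faster as data grows, which is one way to compare curves for new and known classes. Real measurements are noisy and often flatten toward an irreducible error, so a plain log-log fit like this one may need an additive offset term to describe them well.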
Abstract
Empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in the light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-e. Accurately predicting the neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows for focus on the best-scaling, and thus most promising in the future, approaches. In this work, we consider a challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase is different from the source, training, data distribution, in a sense…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Advanced Neural Network Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Contrastive Language-Image Pre-training · Byte Pair Encoding · Dropout · Layer Normalization · Dense Connections · Cosine Annealing
