Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers

Gabriele Prato; Simon Guiroy; Ethan Caballero; Irina Rish; Sarath Chandar

arXiv:2110.06990·cs.LG·October 20, 2021

Open Access

TL;DR

This paper investigates how the scale of pre-training data affects few-shot image classification. It finds that few-shot performance follows a power law in the amount of training data and that performance converges faster on new classes.

Contribution

The paper demonstrates that few-shot performance follows power laws in training data size and that the scaling behavior differs between new and known classes, advancing the understanding of scaling laws in vision models.

Findings

01

Few-shot performance follows power-law scaling.

02

Performance on new classes converges faster.

03

Scaling laws apply across different data domains.
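
To make the first finding concrete, here is a minimal sketch of fitting a power law to error measurements taken at several training set sizes. The functional form err ≈ a · n^(-b), the helper names `fit_power_law` and `predict`, and all numbers are illustrative assumptions, not the paper's fitting procedure or data. A power law appears as a straight line on a log-log plot, so the sketch uses ordinary least squares on the logarithms.

```python
import numpy as np


def fit_power_law(n, err):
    """Fit err ≈ a * n**(-b) by linear least squares in log-log space.

    Hypothetical helper for illustration; returns the pair (a, b).
    """
    slope, intercept = np.polyfit(np.log(n), np.log(err), deg=1)
    return float(np.exp(intercept)), float(-slope)


def predict(n, a, b):
    """Evaluate the fitted power law at training set size(s) n."""
    return a * np.asarray(n, dtype=float) ** (-b)


# Synthetic, noise-free data generated from an assumed power law
# (these are NOT results from the paper).
sizes = np.array([1e3, 1e4, 1e5, 1e6])
true_a, true_b = 2.0, 0.3
errors = true_a * sizes ** (-true_b)

a_hat, b_hat = fit_power_law(sizes, errors)

# Extrapolate to a training set ten times larger than any observed size.
extrapolated = float(predict(1e7, a_hat, b_hat))
```

On real measurements the fit would be noisy, and an additive constant for irreducible error is often needed; that requires a nonlinear fit, which this sketch leaves out.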

Abstract

The empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-E. Accurately predicting neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows a focus on the best-scaling, and thus most promising, approaches. In this work, we consider the challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase differs from the source (training) data distribution, in a sense…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Advanced Neural Network Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Contrastive Language-Image Pre-training · Byte Pair Encoding · Dropout · Layer Normalization · Dense Connections · Cosine Annealing