No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets
Lorenzo Brigato, Stavroula Mougiakakou

TL;DR
This paper explores alternative regularization techniques for training image classifiers on small datasets, achieving competitive accuracy without data augmentation or generative models by optimizing hyperparameters and model scaling.
Contribution
It uses a heuristic based on the norm of the model parameters to select (semi) optimal learning rate and weight decay pairs, enabling effective training on small datasets without data augmentation.
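As a rough illustration of the quantity this heuristic relies on, the sketch below computes the global L2 norm of a set of parameters and shows how it evolves under plain SGD with L2 weight decay for a few (learning rate, weight decay) pairs. Everything here is an assumption made for illustration: the function names `l2_norm` and `sgd_step`, the toy parameters, and the zero gradients are hypothetical, and the paper's actual selection rule is not described in this summary, so it is deliberately not reproduced.

```python
import math

def l2_norm(param_groups):
    """Global L2 norm over all parameter groups (each group is a flat list of floats)."""
    return math.sqrt(sum(w * w for group in param_groups for w in group))

def sgd_step(param_groups, grad_groups, lr, weight_decay):
    """One SGD step with L2 weight decay: w <- w - lr * (grad + weight_decay * w)."""
    return [
        [w - lr * (g + weight_decay * w) for w, g in zip(ws, gs)]
        for ws, gs in zip(param_groups, grad_groups)
    ]

# Hypothetical toy parameters (not a real model): two groups of weights.
params = [[3.0, 4.0], [0.0]]
# Zero gradients isolate the shrinking effect of weight decay on the norm.
zero_grads = [[0.0, 0.0], [0.0]]

# Record the parameter norm after 10 steps for each candidate pair.
norms = {}
for lr, wd in [(0.1, 0.0), (0.1, 0.5), (0.5, 0.5)]:
    p = params
    for _ in range(10):
        p = sgd_step(p, zero_grads, lr, wd)
    norms[(lr, wd)] = l2_norm(p)

for pair, norm in norms.items():
    print(pair, round(norm, 4))
```

With zero gradients the norm shrinks by a factor of (1 - lr * weight_decay) per step, so larger products of the two hyperparameters yield smaller norms. In a real training run the gradients also contribute, and the norm would be measured on the trained network.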
Findings
Reached 66.5% test accuracy on ciFAIR-10 when training on only 1% of the CIFAR-10 training set (50 images per class).
Regularization strategies can match state-of-the-art results without data augmentation.
Selecting learning rate and weight decay pairs via the norm of the model parameters improves training on small datasets.
Abstract
Solving image classification tasks given small training datasets remains an open challenge for modern computer vision. Aggressive data augmentation and generative models are among the most straightforward approaches to overcoming the lack of data. However, the former fails to be agnostic to varying image domains, while the latter requires additional compute and careful design. In this work, we study alternative regularization strategies to push the limits of supervised learning on small image classification datasets. In particular, along with the model size and training schedule scaling, we employ a heuristic to select (semi) optimal learning rate and weight decay couples via the norm of model parameters. By training on only 1% of the original CIFAR-10 training set (i.e., 50 images per class) and testing on ciFAIR-10, a variant of the original CIFAR without duplicated images, we reach a…
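The 1% setting described in the abstract (50 images per class) amounts to drawing a class-balanced subset of the training set. A minimal sketch of such sampling follows. The helper name `balanced_subset_indices`, the seed, and the toy label list are assumptions for illustration and not the authors' code; the toy labels only mimic the shape of the CIFAR-10 training set (10 classes, 5,000 images each).

```python
import random
from collections import defaultdict

def balanced_subset_indices(labels, per_class, seed=0):
    """Return sorted indices containing `per_class` randomly chosen samples of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        pool = by_class[label]
        if len(pool) < per_class:
            raise ValueError(f"class {label} has only {len(pool)} samples")
        # Sample without replacement within each class.
        chosen.extend(rng.sample(pool, per_class))
    return sorted(chosen)

# Toy stand-in for the CIFAR-10 training labels: 10 classes, 5,000 images each.
toy_labels = [c for c in range(10) for _ in range(5000)]
subset = balanced_subset_indices(toy_labels, per_class=50)
print(len(subset))  # 500 indices, i.e. 1% of 50,000
```

The returned indices could then be used to select the matching images from a real dataset. Fixing the seed keeps the subset reproducible across runs.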
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
Methods: Weight Decay
