Impact of base dataset design on few-shot image classification
Othman Sbai, Camille Couprie, Mathieu Aubry

TL;DR
This paper systematically investigates how the design of the base dataset influences the quality of deep features and performance in few-shot image classification, providing practical insights for dataset construction.
Contribution
The paper analyzes base dataset design choices and their effect on few-shot classification, and argues that a well-designed base dataset can yield larger gains than more advanced few-shot algorithms.
Findings
Similarity between base and test classes affects classification performance
The optimal number of classes and images per class depends on the annotation budget
Gains from dataset design can exceed gains from algorithmic improvements
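The annotation-budget trade-off can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `budget_splits`, the budget of 38,400 labeled images, and the candidate class counts are assumptions chosen for the example, not values reported on this page. It lists how many images per class remain affordable as the number of classes grows under a fixed total.

```python
def budget_splits(budget, class_counts):
    """Return (num_classes, images_per_class) pairs that fit a fixed budget.

    `budget` is the total number of labeled images we can afford
    (hypothetical value supplied by the caller).
    """
    splits = []
    for n_classes in class_counts:
        # Floor division so the total never exceeds the budget.
        per_class = budget // n_classes
        if per_class >= 1:  # skip splits that leave a class with no images
            splits.append((n_classes, per_class))
    return splits


# Illustrative numbers only: doubling the class count halves images per class.
for n_classes, per_class in budget_splits(38400, [64, 128, 256, 512]):
    print(f"{n_classes} classes x {per_class} images per class")
```

Each row of the output is one candidate base dataset of equal labeling cost; which of them gives the best features is the empirical question the paper studies.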
Abstract
The quality and generality of deep image features are crucially determined by the data they have been trained on, but little is known about this often overlooked effect. In this paper, we systematically study the effect of variations in the training data by evaluating deep features trained on different image sets in a few-shot classification setting. The experimental protocol we define allows us to explore key practical questions. What is the influence of the similarity between base and test classes? Given a fixed annotation budget, what is the optimal trade-off between the number of images per class and the number of classes? Given a fixed dataset, can features be improved by splitting or combining different classes? Should simple or diverse classes be annotated? In a wide range of experiments, we provide clear answers to these questions on the miniImageNet, ImageNet and CUB-200…
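To illustrate what "evaluating deep features in a few-shot classification setting" can look like, here is a minimal sketch that scores one few-shot episode with a nearest-centroid classifier over precomputed feature vectors. This is a common baseline, assumed here for illustration; the abstract does not state which classifier the authors use, and the helper names (`class_means`, `predict`, `episode_accuracy`) and the toy 2-D features are hypothetical.

```python
import math


def class_means(support):
    """Average the support feature vectors of each class.

    `support` maps a class label to a list of equal-length feature vectors.
    """
    means = {}
    for label, vecs in support.items():
        dim = len(vecs[0])
        means[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return means


def predict(means, query):
    """Return the label whose mean is closest to `query` in Euclidean distance."""
    return min(means, key=lambda label: math.dist(means[label], query))


def episode_accuracy(support, queries):
    """Fraction of (feature, label) query pairs classified correctly."""
    means = class_means(support)
    correct = sum(predict(means, x) == y for x, y in queries)
    return correct / len(queries)


# Toy 2-way, 2-shot episode with made-up 2-D features.
support = {
    "a": [[0.0, 0.0], [0.2, 0.0]],
    "b": [[1.0, 1.0], [1.0, 0.8]],
}
queries = [([0.1, 0.1], "a"), ([0.9, 0.9], "b"), ([0.8, 1.0], "a")]
print(episode_accuracy(support, queries))
```

Because the classifier itself has no trainable parameters, differences in episode accuracy between two feature extractors can be attributed to the features, and therefore to the base data they were trained on.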
