Few-Shot Image Classification Benchmarks are Too Far From Reality: Build Back Better with Semantic Task Sampling
Etienne Bennequin, Myriam Tami, Antoine Toubhans, Celine Hudelot

TL;DR
This paper critiques current few-shot image classification benchmarks for their lack of realism, introduces a semantically balanced benchmark, and proposes a new dataset to better evaluate models in practical, fine-grained, and large-scale scenarios.
Contribution
It exposes biases in existing benchmarks, proposes a semantic task sampling method to create more realistic benchmarks, and introduces a new dataset for diverse, large-scale few-shot classification evaluation.
Findings
Current benchmarks such as tieredImageNet are biased towards tasks composed of semantically dissimilar classes
Performance drops significantly on more fine-grained and large-scale tasks
Semantic similarity between the classes of a task correlates with task difficulty
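To make the first and third findings concrete, here is a minimal Python sketch on a toy class hierarchy. Everything in it is an illustrative assumption rather than the authors' implementation: the `PARENT` taxonomy is hand-made (it is not the hierarchy behind tieredImageNet), `distance` is a plain edge count in that tree, `coarsity` is simply the mean pairwise distance between the classes of a task, and `close_class_task` is a stand-in heuristic, not the paper's sampler. It only illustrates that uniformly drawn tasks tend to mix distant classes, while a sampler that uses semantic information can produce finer-grained tasks.

```python
import itertools
import random

# Hypothetical toy taxonomy (child -> parent), for illustration only.
PARENT = {
    "animal": "entity", "food": "entity", "object": "entity",
    "dog": "animal", "cat": "animal",
    "schipperke": "dog", "poodle": "dog", "beagle": "dog",
    "tabby": "cat", "siamese": "cat",
    "vegetable": "food", "dish": "food",
    "cabbage": "vegetable", "cardoon": "vegetable",
    "pizza": "dish",
    "bathtub": "object",
}
# Classes are the leaves: nodes that are never a parent.
LEAVES = sorted(set(PARENT) - set(PARENT.values()))


def ancestors(node):
    """Return the path from node up to the root, node first."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def distance(a, b):
    """Number of edges between a and b in the toy tree (assumed metric)."""
    depth_b = {n: i for i, n in enumerate(ancestors(b))}
    for i, n in enumerate(ancestors(a)):
        if n in depth_b:
            return i + depth_b[n]
    raise ValueError("nodes share no ancestor")


def coarsity(task):
    """Mean pairwise distance between the classes of a task."""
    pairs = list(itertools.combinations(task, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


def uniform_task(n_way, rng):
    """Draw n_way classes uniformly at random, as standard benchmarks do."""
    return rng.sample(LEAVES, n_way)


def close_class_task(n_way, rng):
    """Stand-in heuristic: pick a seed class, then favour classes near it."""
    task = [rng.choice(LEAVES)]
    while len(task) < n_way:
        candidates = [c for c in LEAVES if c not in task]
        weights = [1.0 / distance(task[0], c) ** 2 for c in candidates]
        task.append(rng.choices(candidates, weights=weights, k=1)[0])
    return task


def mean_coarsity(sampler, n_way=3, n_tasks=2000, seed=0):
    """Average coarsity over many sampled tasks."""
    rng = random.Random(seed)
    return sum(coarsity(sampler(n_way, rng)) for _ in range(n_tasks)) / n_tasks


if __name__ == "__main__":
    print("uniform sampling:    ", mean_coarsity(uniform_task))
    print("close-class sampling:", mean_coarsity(close_class_task))
```

In this toy tree, the example task quoted in the abstract (bathtub, cabbage, pizza, schipperke, cardoon) has a much higher mean pairwise distance than a task made of three dog breeds, which is the kind of gap between coarse and fine-grained tasks that the findings describe.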
Abstract
Every day, a new method is published to tackle Few-Shot Image Classification, showing better and better performances on academic benchmarks. Nevertheless, we observe that these current benchmarks do not accurately represent the real industrial use cases that we encountered. In this work, through both qualitative and quantitative studies, we expose that the widely used benchmark tieredImageNet is strongly biased towards tasks composed of very semantically dissimilar classes e.g. bathtub, cabbage, pizza, schipperke, and cardoon. This makes tieredImageNet (and similar benchmarks) irrelevant to evaluate the ability of a model to solve real-life use cases usually involving more fine-grained classification. We mitigate this bias using semantic information about the classes of tieredImageNet and generate an improved, balanced benchmark. Going further, we also introduce a new benchmark for…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Machine Learning and Data Classification
