Few-Shot Image Classification Benchmarks are Too Far From Reality: Build Back Better with Semantic Task Sampling

Etienne Bennequin; Myriam Tami; Antoine Toubhans; Celine Hudelot

arXiv:2205.05155·cs.CV·May 12, 2022

TL;DR

This paper critiques current few-shot image classification benchmarks for their lack of realism, introduces a semantically balanced benchmark, and proposes a new dataset to better evaluate models in practical, fine-grained, and large-scale scenarios.

Contribution

It exposes biases in existing benchmarks, proposes a semantic task sampling method to create more realistic benchmarks, and introduces a new dataset for diverse, large-scale few-shot classification evaluation.

Findings

01. Current benchmarks are biased towards tasks composed of semantically dissimilar classes.

02. Performance drops significantly on more fine-grained and large-scale tasks.

03. Semantic similarity between classes correlates with task difficulty.

Abstract

Every day, a new method is published to tackle Few-Shot Image Classification, showing better and better performance on academic benchmarks. Nevertheless, we observe that these benchmarks do not accurately represent the real industrial use cases that we have encountered. In this work, through both qualitative and quantitative studies, we show that the widely used benchmark tieredImageNet is strongly biased towards tasks composed of very semantically dissimilar classes, e.g. bathtub, cabbage, pizza, schipperke, and cardoon. This makes tieredImageNet (and similar benchmarks) irrelevant for evaluating the ability of a model to solve real-life use cases, which usually involve more fine-grained classification. We mitigate this bias using semantic information about the classes of tieredImageNet and generate an improved, balanced benchmark. Going further, we also introduce a new benchmark for…
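
To make the idea of semantically biased task sampling concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a precomputed matrix of pairwise semantic distances between classes (for instance, one derived from a class hierarchy such as WordNet) and draws each new class with a probability that decays with its mean distance to the classes already picked. The function names `task_coarsity` and `sample_semantic_task` and the sharpness parameter `alpha` are hypothetical and chosen only for illustration.

```python
import itertools
import math
import random


def task_coarsity(classes, dist):
    """Mean pairwise semantic distance among the classes of one task.

    Hypothetical helper: low values indicate a fine-grained task.
    """
    pairs = list(itertools.combinations(classes, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def sample_semantic_task(dist, n_way, alpha=1.0, rng=random):
    """Sample n_way distinct class indices, favouring semantically close classes.

    dist:  square symmetric matrix (list of lists) of non-negative distances.
    alpha: assumed sharpness knob; alpha=0 falls back to uniform sampling.
    rng:   the random module or a random.Random instance.
    """
    n_classes = len(dist)
    chosen = [rng.randrange(n_classes)]  # first class is drawn uniformly
    while len(chosen) < n_way:
        candidates = [c for c in range(n_classes) if c not in chosen]
        # Weight each candidate by exp(-alpha * mean distance to chosen classes).
        weights = [
            math.exp(-alpha * sum(dist[c][k] for k in chosen) / len(chosen))
            for c in candidates
        ]
        chosen.append(rng.choices(candidates, weights=weights, k=1)[0])
    return chosen


# Toy data (invented): six classes in two groups of three,
# distance 1 inside a group and 10 across groups.
toy_dist = [
    [0 if i == j else (1 if i // 3 == j // 3 else 10) for j in range(6)]
    for i in range(6)
]
task = sample_semantic_task(toy_dist, n_way=3, alpha=2.0)
print(task, task_coarsity(task, toy_dist))
```

On the toy matrix, tasks drawn this way almost always stay inside one group, whereas uniform sampling mostly mixes the two groups. This mirrors, in miniature, the contrast between fine-grained tasks and tasks such as bathtub, cabbage, pizza, schipperke, and cardoon.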

Code & Models

Repositories

sicara/semantic-task-sampling (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Machine Learning and Data Classification