Oops, I Sampled it Again: Reinterpreting Confidence Intervals in Few-Shot Learning

Raphael Lafargue; Luke Smith; Franck Vermet; Mathias Löwe; Ian Reid; Vincent Gripon; Jack Valmadre

arXiv:2409.02850 · cs.LG · September 9, 2024

TL;DR

This paper examines the common practice of sampling tasks with replacement when computing confidence intervals in few-shot learning, shows that this practice underestimates the intervals, and proposes paired tests and strategic task sampling for more accurate evaluation.

Contribution

The paper shows that confidence intervals in FSL are underestimated when tasks are sampled with replacement, and it introduces improved evaluation techniques and a new benchmark.

Findings

01 Sampling with replacement underestimates CIs in FSL

02 Paired tests can partially correct CI underestimation

03 Strategic task sampling reduces CI size

Abstract

The predominant method for computing confidence intervals (CIs) in few-shot learning (FSL) is based on sampling the tasks with replacement, i.e., allowing the same samples to appear in multiple tasks. This makes the CI misleading in that it takes into account the randomness of the sampler but not of the data itself. To quantify the extent of this problem, we conduct a comparative analysis between CIs computed with and without replacement. This analysis reveals a notable underestimation by the predominant method. This observation calls for a reevaluation of how we interpret confidence intervals and the resulting conclusions in FSL comparative studies. Our research demonstrates that the use of paired tests can partially address this issue. Additionally, we explore methods to further reduce the (size of the) CI by strategically sampling tasks of a specific size. We also introduce a new optimized…
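To make the quantities in the abstract concrete, here is a minimal sketch of how a 95% CI over per-task accuracies and a paired CI over per-task differences might be computed. It is not the authors' code: the function names `mean_ci95` and `paired_ci95`, the normal approximation (the 1.96 factor), and the accuracy values are all assumptions chosen for illustration.

```python
import math
import statistics


def mean_ci95(values):
    """Return (mean, half_width) of a normal-approximation 95% CI.

    Assumes the values are independent draws, which is exactly the
    assumption that breaks when tasks share samples.
    """
    n = len(values)
    mean = statistics.fmean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(n)
    return mean, half_width


def paired_ci95(acc_a, acc_b):
    """CI on the mean per-task difference between two methods.

    Both methods must have been evaluated on the same tasks, in the
    same order.
    """
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    return mean_ci95(diffs)


# Hypothetical per-task accuracies for two methods on the same six tasks.
acc_a = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74]
acc_b = [0.60, 0.68, 0.54, 0.77, 0.66, 0.70]

mean_a, hw_a = mean_ci95(acc_a)
mean_b, hw_b = mean_ci95(acc_b)
diff, hw_diff = paired_ci95(acc_a, acc_b)

print(f"method A: {mean_a:.3f} +/- {hw_a:.3f}")
print(f"method B: {mean_b:.3f} +/- {hw_b:.3f}")
print(f"paired difference A - B: {diff:.3f} +/- {hw_diff:.3f}")
```

In this made-up data the two per-method intervals overlap heavily, while the interval on the paired difference is much narrower because the task-to-task variation common to both methods cancels out. The per-method half-widths are only meaningful if the tasks are independent; when tasks are drawn with replacement from a finite pool of samples, that assumption does not hold, which is the concern the abstract raises.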

Code & Models

Repositories

raflaf/fsl-benchmark-again (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Machine Learning and Data Classification