Sample Efficient Model Evaluation
Emine Yilmaz, Peter Hayes, Raza Habib, Jordan Burgess, David Barber

TL;DR
This paper introduces a novel application of Poisson Sampling for selecting which data points to label, and shows that it estimates test metrics more efficiently than traditional Importance Sampling.
Contribution
The paper applies Poisson Sampling to the problem of choosing which data points to label for model evaluation, derives the minimal-error sampling distributions for both Poisson Sampling and Importance Sampling, and shows that Poisson Sampling performs better.
Findings
Poisson Sampling outperforms Importance Sampling in theory
Poisson Sampling provides more accurate estimators and confidence intervals
Experimental results confirm the efficiency gains of Poisson Sampling
Abstract
Labelling data is a major practical bottleneck in training and testing classifiers. Given a collection of unlabelled data points, we address how to select which subset to label to best estimate test metrics such as accuracy, F1 score or micro/macro F1. We consider two sampling-based approaches: the well-known Importance Sampling, and a novel application of Poisson Sampling. For both approaches we derive the minimal-error sampling distributions and show how to approximate them and use them to form estimators and confidence intervals. We show that Poisson Sampling outperforms Importance Sampling both theoretically and experimentally.
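The difference between the two sampling schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the toy data, the heuristic proposal `q` (built from a hypothetical per-item error-probability guess), and all function names are assumptions made for illustration, and the proposal is not the minimal-error distribution the paper derives. Importance Sampling draws a fixed number of items with replacement from `q`, while Poisson Sampling includes each item independently with its own probability and reweights with the standard Horvitz-Thompson estimator.

```python
import random


def make_toy_problem(n, rng):
    """Hypothetical toy data: 0/1 losses plus a heuristic proposal q (sums to 1)."""
    # p_err[i] is an assumed guess of how likely the model is wrong on item i.
    p_err = [0.5 * rng.random() for _ in range(n)]
    # True 0/1 loss per item (1 = misclassified); unknown until labelled.
    losses = [1.0 if rng.random() < p else 0.0 for p in p_err]
    # Heuristic proposal: favour likely errors, with a floor so every q[i] > 0.
    weights = [0.1 + p for p in p_err]
    total = sum(weights)
    q = [w / total for w in weights]
    return losses, q


def importance_sampling_estimate(losses, q, m, rng):
    """Draw m indices with replacement from q; average loss / (n * q[i])."""
    n = len(losses)
    idx = rng.choices(range(n), weights=q, k=m)
    return sum(losses[i] / (n * q[i]) for i in idx) / m


def inclusion_probabilities(q, m):
    """Scale q so the expected number of labelled items is about m (capped at 1)."""
    return [min(1.0, m * qi) for qi in q]


def poisson_sampling_estimate(losses, pi, rng):
    """Include item i independently with probability pi[i] (Horvitz-Thompson mean)."""
    n = len(losses)
    return sum(l / p for l, p in zip(losses, pi) if rng.random() < p) / n


if __name__ == "__main__":
    rng = random.Random(0)
    losses, q = make_toy_problem(1000, rng)
    pi = inclusion_probabilities(q, m=100)
    print("true error rate:", sum(losses) / len(losses))
    print("importance sampling:", importance_sampling_estimate(losses, q, 100, rng))
    print("poisson sampling:", poisson_sampling_estimate(losses, pi, rng))
```

Both estimators are unbiased for the mean loss whenever every item has a non-zero chance of selection. With Poisson Sampling the number of labelled items is random, and no item is labelled twice. The sketch says nothing about which scheme has lower variance; that comparison depends on the sampling distributions used, which is the subject of the paper.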
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Machine Learning and Data Classification
Methods: Test
