Comparison of classifiers in challenge scheme
Sergio Nava-Muñoz, Mario Graff Guerrero, Hugo Jair Escalante

TL;DR
This paper evaluates the performance of machine learning classifiers in challenge schemes, using resampling techniques to determine the significance of small performance differences among competitors.
Contribution
It introduces a method to assess the significance of classifier performance differences in challenge settings using bootstrap resampling.
Findings
Bootstrap resampling supports decision-making in challenge evaluations.
Small performance differences among classifiers may not be statistically significant.
The approach aids in fairer comparison of algorithms in constrained challenge environments.
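The idea behind these findings can be illustrated with a small paired-bootstrap sketch. This is not the authors' exact procedure: the function name `bootstrap_diff_ci`, the choice of accuracy as the metric, the number of resamples, and the percentile interval are all assumptions made for illustration. The sketch resamples the fixed test set with replacement, recomputes the accuracy difference between two competitors on each resample, and reports a confidence interval for that difference.

```python
import numpy as np


def bootstrap_diff_ci(y_true, pred_a, pred_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy(A) - accuracy(B) on one test set.

    Hypothetical helper for illustration; accuracy is an assumed metric.
    """
    y_true = np.asarray(y_true)
    # Per-example correctness (1.0 = correct) for each competitor.
    correct_a = (np.asarray(pred_a) == y_true).astype(float)
    correct_b = (np.asarray(pred_b) == y_true).astype(float)

    rng = np.random.default_rng(seed)
    n = len(y_true)
    # Each row holds the indices of one resample drawn with replacement.
    # The same indices are used for both competitors (paired comparison).
    idx = rng.integers(0, n, size=(n_boot, n))
    diffs = correct_a[idx].mean(axis=1) - correct_b[idx].mean(axis=1)

    low, high = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return float(low), float(high)


if __name__ == "__main__":
    # Synthetic labels and predictions, used only to exercise the function.
    rng = np.random.default_rng(42)
    y = rng.integers(0, 2, size=500)
    a = np.where(rng.random(500) < 0.80, y, 1 - y)  # roughly 80% accurate
    b = np.where(rng.random(500) < 0.78, y, 1 - y)  # roughly 78% accurate
    low, high = bootstrap_diff_ci(y, a, b)
    print(f"95% CI for accuracy difference: [{low:.3f}, {high:.3f}]")
```

If the interval contains zero, the observed gap between the two competitors is not distinguishable from resampling noise at the chosen level, which is the situation the findings describe for small performance differences.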
Abstract
In recent decades, challenges have become very popular in scientific research as crowdsourcing schemes. In particular, challenges are essential for developing machine learning algorithms. In challenge settings, it is vital to establish the scientific question, the dataset (with adequate quality, quantity, diversity, and complexity), the performance metrics, and a way to authenticate the participants' results (Gold Standard). This paper addresses the problem of evaluating the performance of different competitors (algorithms) under the restrictions imposed by the challenge scheme, such as the comparison of multiple competitors on a single dataset of fixed size, a minimal number of submissions, and a set of metrics chosen to assess performance. The algorithms are sorted according to the performance metric. Still, it is common to observe performance differences…
Taxonomy
Topics: Food Supply Chain Traceability
