Bayes Error Rate Estimation in Difficult Situations
Lesley Wheat, Martin v. Mohrenschildt, Saeid Habibi

TL;DR
This paper evaluates various estimators of the Bayes Error Rate (BER) to determine their accuracy and practicality in difficult, high-dimensional classification scenarios, highlighting kNN's superior performance under sample constraints.
Contribution
The study provides a comprehensive comparison of BER estimators using synthetic and real-world scenarios, establishing minimum sample requirements and demonstrating kNN's robustness.
Findings
kNN outperforms other estimators in accuracy
A minimum of 1000 samples per class is needed for reliable BER estimation
More features require more samples to maintain estimation accuracy
Abstract
The Bayes Error Rate (BER) is the fundamental limit on the achievable generalizable classification accuracy of any machine learning model due to inherent uncertainty within the data. BER estimators offer insight into the difficulty of any classification problem and set expectations for optimal classification performance. In order to be useful, the estimators must also be accurate with a limited number of samples on multivariate problems with unknown class distributions. To determine which estimators meet the minimum requirements for "usefulness", an in-depth examination of their accuracy is conducted using Monte Carlo simulations with synthetic data in order to obtain their confidence bounds for binary classification. To examine the usability of the estimators for real-world applications, new non-linear multi-modal test scenarios are introduced. In each scenario, 2500 Monte Carlo…
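To make the idea of a kNN-based BER estimate checked by Monte Carlo simulation concrete, here is a minimal sketch. It is not the paper's estimator or test scenario. It assumes a deliberately simple synthetic problem (two equal-prior 1-D Gaussians with means -1 and +1 and unit variance) for which the true BER is known in closed form, and it uses the classical Cover-Hart relation for binary problems, which holds asymptotically: if R is the 1-NN error rate, then (1 - sqrt(1 - 2R)) / 2 <= BER <= R. All function names, the sample sizes, and the number of runs are illustrative choices, not values taken from the paper.

```python
import math
import random


def true_ber_two_gaussians(mu0, mu1, sigma):
    """Exact BER for two equal-prior 1-D Gaussians sharing one sigma."""
    d = abs(mu1 - mu0) / (2.0 * sigma)
    return 0.5 * (1.0 - math.erf(d / math.sqrt(2.0)))  # equals Phi(-d)


def loo_1nn_error(xs, ys):
    """Leave-one-out 1-NN error rate for 1-D data (needs >= 2 points).

    In sorted 1-D data the nearest neighbour of a point is always one of
    its two adjacent points, so only those need to be compared.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    errors = 0
    for pos, i in enumerate(order):
        best_j, best_dist = None, float("inf")
        for nb in (pos - 1, pos + 1):
            if 0 <= nb < len(order):
                j = order[nb]
                dist = abs(xs[i] - xs[j])
                if dist < best_dist:
                    best_j, best_dist = j, dist
        errors += ys[best_j] != ys[i]
    return errors / len(xs)


def ber_bounds_from_1nn(r_nn):
    """Cover-Hart bounds for the binary case: lower <= BER <= r_nn."""
    lower = 0.5 * (1.0 - math.sqrt(max(0.0, 1.0 - 2.0 * r_nn)))
    return lower, r_nn


def monte_carlo(n_per_class=1000, n_runs=20, seed=0):
    """Average the BER bounds over repeated synthetic draws."""
    rng = random.Random(seed)
    lowers, uppers = [], []
    for _ in range(n_runs):
        xs = [rng.gauss(-1.0, 1.0) for _ in range(n_per_class)]
        xs += [rng.gauss(1.0, 1.0) for _ in range(n_per_class)]
        ys = [0] * n_per_class + [1] * n_per_class
        lo, hi = ber_bounds_from_1nn(loo_1nn_error(xs, ys))
        lowers.append(lo)
        uppers.append(hi)
    return sum(lowers) / n_runs, sum(uppers) / n_runs


if __name__ == "__main__":
    lo, hi = monte_carlo()
    ber = true_ber_two_gaussians(-1.0, 1.0, 1.0)
    print(f"true BER = {ber:.4f}, kNN bounds = [{lo:.4f}, {hi:.4f}]")
```

Because the bounds are asymptotic, they can be loose or slightly off at small sample sizes. Rerunning `monte_carlo` with smaller `n_per_class` values is one way to see how the sample count affects the estimate.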
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Statistical Methods and Models · Face and Expression Recognition
