Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning
Sebastian Raschka

TL;DR
This paper reviews techniques for model evaluation, selection, and algorithm comparison in machine learning, providing practical recommendations and discussing their advantages, disadvantages, and empirical performance, especially for small datasets.
Contribution
It offers a comprehensive overview of evaluation and selection methods, including best practices, statistical tests, and alternative approaches like nested cross-validation, with empirical insights.
Findings
Bootstrap methods effectively estimate performance uncertainty.
The choice of k in k-fold cross-validation is a trade-off between bias and variance.
Nested cross-validation is recommended for comparing algorithms on small datasets.
Abstract
The correct use of model evaluation, model selection, and algorithm selection techniques is vital in academic machine learning research as well as in many industrial settings. This article reviews different techniques that can be used for each of these three subtasks and discusses the main advantages and disadvantages of each technique with references to theoretical and empirical studies. Further, recommendations are given to encourage best yet feasible practices in research and applications of machine learning. Common methods such as the holdout method for model evaluation and selection are covered, which are not recommended when working with small datasets. Different flavors of the bootstrap technique are introduced for estimating the uncertainty of performance estimates, as an alternative to confidence intervals via normal approximation if bootstrapping is computationally feasible.…
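As a rough illustration of the two uncertainty estimates the abstract contrasts, the sketch below computes a 95% confidence interval for classification accuracy in two ways: via the normal approximation, and via a percentile bootstrap. It is a simplified sketch under stated assumptions, not the paper's procedure: the per-example outcomes in `correct` are made-up data, the function names are hypothetical, and the bootstrap here only resamples test-set outcomes of a fixed model, which may differ from the bootstrap flavors the paper covers.

```python
import math
import random


def normal_approx_ci(acc, n, z=1.96):
    """CI for accuracy via the normal approximation to the binomial (z=1.96 for ~95%)."""
    half_width = z * math.sqrt(acc * (1.0 - acc) / n)
    return max(0.0, acc - half_width), min(1.0, acc + half_width)


def bootstrap_percentile_ci(correct, n_rounds=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI over per-example 0/1 correctness flags."""
    rng = random.Random(seed)
    n = len(correct)
    accs = []
    for _ in range(n_rounds):
        # Resample the outcomes with replacement and record the accuracy.
        sample = rng.choices(correct, k=n)
        accs.append(sum(sample) / n)
    accs.sort()
    lower = accs[int((alpha / 2) * n_rounds)]
    upper = accs[int((1 - alpha / 2) * n_rounds) - 1]
    return lower, upper


# Hypothetical test-set outcomes (assumption): 1 = correct, 0 = wrong.
correct = [1] * 85 + [0] * 15
acc = sum(correct) / len(correct)

ci_normal = normal_approx_ci(acc, len(correct))
ci_boot = bootstrap_percentile_ci(correct)

print(f"accuracy: {acc:.3f}")
print(f"normal approximation CI: ({ci_normal[0]:.3f}, {ci_normal[1]:.3f})")
print(f"percentile bootstrap CI: ({ci_boot[0]:.3f}, {ci_boot[1]:.3f})")
```

The normal approximation needs only the accuracy and the test-set size, while the bootstrap costs `n_rounds` resampling passes; that cost is the computational-feasibility caveat the abstract mentions.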
Taxonomy
Topics: Machine Learning and Data Classification · Statistical Methods and Inference · Statistical Methods and Bayesian Inference
