$\left( \beta, \varpi \right)$-stability for cross-validation and the choice of the number of folds
Ning Xu, Jian Hong, Timothy C.G. Fisher

TL;DR
This paper introduces a new stability concept for cross-validation, called (β, ϖ)-stability, which links generalization ability and stability through the Rademacher complexity and yields bounds that inform the choice of the number of folds.
Contribution
It develops the (β, ϖ)-stability framework for cross-validation, derives new Rademacher bounds that apply to both i.i.d. and non-i.i.d. data, and uses them to guide the selection of the number of folds.
Findings
The new (β, ϖ)-stability concept connects the generalization ability and the stability of the cross-validated model.
The derived bounds quantify the stability of cross-validation in both i.i.d. and non-i.i.d. settings.
Empirical results suggest that the number of folds can be chosen to minimize the test error bound.
Abstract
In this paper, we introduce a new concept of stability for cross-validation, called the $\left( \beta, \varpi \right)$-stability, and use it as a new perspective to build the general theory for cross-validation. The $\left( \beta, \varpi \right)$-stability mathematically connects the generalization ability and the stability of the cross-validated model via the Rademacher complexity. Our result reveals mathematically the effect of cross-validation from two sides: on one hand, cross-validation picks the model with the best empirical generalization ability by validating all the alternatives on test sets; on the other hand, cross-validation may compromise the stability of the model selection by causing subsampling error. Moreover, the difference between training and test errors in the $q$th round, sometimes referred to as the generalization error, might be autocorrelated in $q$.…
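To make the per-round quantity in the abstract concrete, the following is a minimal sketch, not the authors' method, of K-fold cross-validation that records the gap between test and training error in each round q and then estimates its lag-1 autocorrelation across rounds. The function names `kfold_gaps` and `lag1_autocorr`, the least-squares model, the synthetic data, and the sign convention (test error minus training error) are all assumptions made for illustration.

```python
import numpy as np


def kfold_gaps(X, y, K, seed=0):
    """Return the per-round gap (test MSE minus training MSE) for K-fold CV.

    Hypothetical helper: the model is ordinary least squares, chosen only
    to keep the example self-contained.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, K)  # K disjoint index sets
    gaps = []
    for q in range(K):
        test = folds[q]
        train = np.concatenate([folds[j] for j in range(K) if j != q])
        # Fit on the K-1 training folds of round q.
        coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        train_err = np.mean((X[train] @ coef - y[train]) ** 2)
        test_err = np.mean((X[test] @ coef - y[test]) ** 2)
        gaps.append(test_err - train_err)
    return np.array(gaps)


def lag1_autocorr(g):
    """Sample lag-1 autocorrelation of the sequence g (0.0 if g is constant)."""
    d = g - g.mean()
    denom = np.sum(d * d)
    return float(np.sum(d[1:] * d[:-1]) / denom) if denom > 0 else 0.0


# Synthetic regression data (assumed, for illustration only).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.5, size=200)

gaps = kfold_gaps(X, y, K=10)
rho = lag1_autocorr(gaps)
print("per-round gaps:", np.round(gaps, 4))
print("lag-1 autocorrelation of the gaps:", round(rho, 4))
```

Because the training sets of different rounds overlap heavily, the gaps are not independent draws; this sketch only measures one simple form of that dependence and says nothing about the paper's bounds.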
Taxonomy
Topics: Machine Learning and Algorithms · Fault Detection and Control Systems · Statistical Methods and Inference
