Cross-validation improved by aggregation: Agghoo
Guillaume Maillard (LMO, SELECT, LM-Orsay), Sylvain Arlot (LMO, SELECT, LM-Orsay), Matthieu Lerasle (LMO, SELECT, LM-Orsay)

TL;DR
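Agghoo (aggregated hold-out) mixes cross-validation with aggregation and is related to bagging. In numerical experiments it significantly improves on cross-validation's prediction error at the same computational cost, and it comes with the first theoretical guarantees in supervised classification, making it a promising general-purpose tool for prediction.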
Contribution
This paper studies aggregated hold-out (Agghoo), a method that mixes cross-validation with aggregation. It provides the first theoretical guarantees for Agghoo in supervised classification, together with numerical evidence of improved prediction performance.
Findings
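In numerical experiments, Agghoo can significantly improve on cross-validation's prediction error at the same computational cost.
In supervised classification, Agghoo is guaranteed to perform, at worst, like the hold-out, up to a constant factor.
In binary classification under the margin condition, Agghoo satisfies a non-asymptotic oracle inequality that is sharp enough to yield fast minimax rates.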
Abstract
Cross-validation is widely used for selecting among a family of learning rules. This paper studies a related method, called aggregated hold-out (Agghoo), which mixes cross-validation with aggregation; Agghoo can also be related to bagging. According to numerical experiments, Agghoo can significantly improve on cross-validation's prediction error, at the same computational cost; this makes it very promising as a general-purpose tool for prediction. We provide the first theoretical guarantees on Agghoo, in the supervised classification setting, ensuring that one can use it safely: at worst, Agghoo performs like the hold-out, up to a constant factor. We also prove a non-asymptotic oracle inequality, in binary classification under the margin condition, which is sharp enough to get (fast) minimax rates.
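The idea of aggregating hold-out selections can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that, for each random train/validation split, the hold-out picks the rule with the smallest validation error, and that the selected classifiers are then combined by majority vote. The names knn_rule, hold_out and agghoo, the k-nearest-neighbour family, and the toy data are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def knn_rule(k):
    """Hypothetical learning rule: k-nearest neighbours on 1-D inputs."""
    def fit(train):
        def predict(x):
            nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
            return Counter(y for _, y in nearest).most_common(1)[0][0]
        return predict
    return fit


def hold_out(data, rules, train_idx):
    """Train every rule on the training part and return the predictor
    with the smallest error on the remaining (validation) points."""
    train_set = set(train_idx)
    train = [data[i] for i in train_idx]
    valid = [data[i] for i in range(len(data)) if i not in train_set]
    best_pred, best_err = None, float("inf")
    for fit in rules:
        pred = fit(train)
        err = sum(pred(x) != y for x, y in valid) / len(valid)
        if err < best_err:
            best_pred, best_err = pred, err
    return best_pred


def agghoo(data, rules, n_splits=5, train_frac=0.8, seed=0):
    """Run the hold-out on several random splits, then aggregate the
    selected classifiers by majority vote (assumed aggregation step)."""
    rng = random.Random(seed)
    n_train = int(train_frac * len(data))
    selected = [
        hold_out(data, rules, rng.sample(range(len(data)), n_train))
        for _ in range(n_splits)
    ]

    def predict(x):
        votes = Counter(pred(x) for pred in selected)
        return votes.most_common(1)[0][0]
    return predict


# Toy data (an assumption): label is 1 when x > 0.5, with 10% label noise.
rng = random.Random(1)
data = []
for _ in range(200):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

predictor = agghoo(data, [knn_rule(k) for k in (1, 5, 15)])
print(predictor(0.2), predictor(0.8))
```

Each split may select a different rule, so the final classifier is a vote over predictors that can differ in both their hyperparameter and their training subsample, which is where the link with bagging comes from. An odd number of splits avoids ties in binary classification.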
Taxonomy
Topics: Imbalanced Data Classification Techniques · Machine Learning and Data Classification · Machine Learning and Algorithms
