Fast Cross-Validation via Sequential Testing
Tammo Krueger, Danny Panknin, Mikio Braun

TL;DR
This paper introduces a fast cross-validation method that uses sequential testing on growing subsets of the data to discard weak parameter configurations early, cutting computation time by a factor of up to 120 with negligible impact on accuracy.
Contribution
It presents a nonparametric sequential testing approach to cross-validation that accelerates parameter selection on large data sets.
Findings
Reduces computation time by a factor of up to 120 compared to full cross-validation
Has a negligible impact on accuracy relative to full cross-validation
Is backed by theoretical considerations on the statistical power of the procedure
Abstract
With the increasing size of today's data sets, finding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which uses nonparametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. By eliminating underperforming candidates quickly and keeping promising candidates as long as possible, the method speeds up the computation while preserving the capability of the full cross-validation. Theoretical considerations underline the statistical power of our procedure. The experimental evaluation shows that our method reduces the computation time by a factor of up to 120 compared to a full cross-validation with a negligible impact on the accuracy.
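The idea of dropping weak candidates early on linearly growing subsets can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the tests used, so the sketch swaps in a plain one-sided sign test applied at every step in place of the paper's nonparametric test and sequential analysis. The 1-D ridge model, the synthetic data, and all names (`fit_ridge_1d`, `sign_test_pvalue`, `fast_cv_sketch`, `alpha`, `steps`) are assumptions made for illustration only.

```python
import math
import random


def fit_ridge_1d(points, lam):
    """Closed-form 1-D ridge regression without intercept: w = sum(xy) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / (sxx + lam)


def sign_test_pvalue(losses, ref_losses):
    """One-sided exact sign test: are `losses` larger than `ref_losses` more often than chance?"""
    worse = sum(a > b for a, b in zip(losses, ref_losses))
    better = sum(a < b for a, b in zip(losses, ref_losses))
    m = worse + better  # ties carry no information and are dropped
    if m == 0:
        return 1.0
    # P(X >= worse) for X ~ Binomial(m, 0.5)
    tail = sum(math.comb(m, i) for i in range(worse, m + 1))
    return tail / 2 ** m


def fast_cv_sketch(data, candidates, steps=5, alpha=0.01):
    """Train on linearly growing prefixes, test on the rest, drop clear losers each step."""
    step_size = len(data) // (steps + 1)
    active = list(candidates)
    mean_loss = {}
    for s in range(1, steps + 1):
        train, test = data[: s * step_size], data[s * step_size:]
        # Pointwise squared errors of every still-active candidate on the held-out part
        losses = {}
        for lam in active:
            w = fit_ridge_1d(train, lam)
            losses[lam] = [(y - w * x) ** 2 for x, y in test]
        mean_loss = {lam: sum(v) / len(v) for lam, v in losses.items()}
        best = min(active, key=mean_loss.get)
        # Keep the current best and every candidate not significantly worse than it
        active = [
            lam for lam in active
            if lam == best or sign_test_pvalue(losses[lam], losses[best]) >= alpha
        ]
        if len(active) == 1:
            break  # only one candidate left, no need to process larger subsets
    selected = min(active, key=mean_loss.get)
    return selected, active


# Synthetic data (assumption): y = 2x + Gaussian noise
rng = random.Random(0)
data = []
for _ in range(600):
    x = rng.gauss(0.0, 1.0)
    data.append((x, 2.0 * x + rng.gauss(0.0, 0.5)))

candidates = [0.0, 0.1, 1.0, 10.0, 100.0, 1000.0]
selected, survivors = fast_cv_sketch(data, candidates)
print("selected:", selected, "survivors:", survivors)
```

In this toy setting the heavily regularized candidates are rejected on the first, smallest subset, so the later and more expensive training runs are spent only on the configurations that remain plausible. That is the source of the savings the abstract describes, although the paper's actual procedure and its guarantees differ from this simplified rule.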
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Algorithms · Machine Learning and Data Classification
