Fast Cross-Validation via Sequential Testing

Tammo Krueger; Danny Panknin; Mikio Braun

arXiv:1206.2248 · cs.LG · February 5, 2016 · 24 cites

TL;DR

This paper introduces a fast cross-validation method that uses sequential testing to efficiently identify optimal parameters, significantly reducing computation time while maintaining accuracy.

Contribution

It presents a novel nonparametric sequential testing approach for cross-validation that accelerates parameter selection in large datasets.

Findings

01 Reduces computation time by a factor of up to 120 compared with full cross-validation

02 Maintains accuracy comparable to full cross-validation

03 Theoretical considerations support the statistical power of the procedure

Abstract

With the increasing size of today's data sets, finding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which uses nonparametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. By eliminating underperforming candidates quickly and keeping promising candidates as long as possible, the method speeds up the computation while preserving the capability of the full cross-validation. Theoretical considerations underline the statistical power of our procedure. The experimental evaluation shows that our method reduces the computation time by a factor of up to 120 compared to a full cross-validation with a negligible impact on the accuracy.
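
The elimination idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fast_cv_sketch`, the `score` callback, and the `tolerance` and `patience` parameters are all hypothetical, and a simple threshold-plus-counter rule stands in for the paper's nonparametric test and sequential analysis. Only the overall shape follows the abstract: evaluate all remaining candidates on linearly increasing subsets, drop consistent underperformers early, and keep promising candidates.

```python
from typing import Callable, Dict, List, Sequence


def fast_cv_sketch(
    candidates: Sequence[float],
    score: Callable[[float, int], float],
    n_samples: int,
    steps: int = 10,
    tolerance: float = 0.05,
    patience: int = 2,
) -> List[float]:
    """Return the candidates still active after the last step.

    score(candidate, train_size) is a hypothetical callback that trains a
    model on train_size points and returns an error (lower is better).
    """
    active = list(candidates)
    flags: Dict[float, int] = {c: 0 for c in active}  # times flagged as loser

    for step in range(1, steps + 1):
        # Linearly increasing subset sizes: n/steps, 2n/steps, ..., n.
        train_size = step * n_samples // steps
        errors = {c: score(c, train_size) for c in active}
        best = min(errors.values())

        # Assumption: a fixed tolerance replaces the paper's nonparametric
        # test for deciding which candidates underperform at this step.
        for c in active:
            if errors[c] > best + tolerance:
                flags[c] += 1

        # Assumption: a flag counter replaces the paper's sequential test.
        # The best candidate of a step is never flagged in that step, so at
        # least one candidate always survives.
        active = [c for c in active if flags[c] < patience]

    return active


# Toy usage with a synthetic error curve whose minimum is at candidate 3.0.
survivors = fast_cv_sketch(
    [1.0, 2.0, 3.0, 4.0],
    lambda c, m: abs(c - 3.0) + 1.0 / m,
    n_samples=100,
)
print(survivors)  # [3.0]
```

In this toy run the three weaker candidates are dropped after the second step, so the callback is invoked 16 times instead of the 40 evaluations needed to score every candidate at every step. The savings in a real setting depend on how quickly the chosen test separates the candidates.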

Code & Models

Repositories

tammok/CVST (official)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Algorithms · Machine Learning and Data Classification