A Theory of Cross-Validation Error
Peter D. Turney (National Research Council of Canada)

TL;DR
This paper develops a theoretical framework for understanding and minimizing cross-validation error in predicting real-valued attributes, emphasizing the balance between model simplicity and accuracy.
Contribution
It introduces a general theory of cross-validation error and details its application to linear regression and instance-based learning.
Findings
The theory explains the trade-off between simplicity and accuracy in prediction models.
It shows how the two demands must be balanced in order to minimize cross-validation error.
The general framework is worked out in detail for two families of predictive algorithms: linear regression and instance-based learning.
Abstract
This paper presents a theory of error in cross-validation testing of algorithms for predicting real-valued attributes. The theory justifies the claim that predicting real-valued attributes requires balancing the conflicting demands of simplicity and accuracy. Furthermore, the theory indicates precisely how these conflicting demands must be balanced in order to minimize cross-validation error. A general theory is presented and then developed in detail for linear regression and instance-based learning.
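The trade-off described in the abstract can be illustrated with a small experiment. The sketch below is not the paper's formalism: it assumes ordinary k-fold cross-validation, synthetic data drawn from a quadratic with Gaussian noise, and polynomial degree as a stand-in for model complexity. All function names (`solve`, `fit_poly`, `predict`, `cv_error`) are hypothetical helpers written for this example.

```python
import random

def solve(a, b):
    """Solve the square system a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w

def fit_poly(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via the normal equations."""
    feats = [[x ** p for p in range(degree + 1)] for x in xs]
    n = degree + 1
    gram = [[sum(f[i] * f[j] for f in feats) for j in range(n)] for i in range(n)]
    rhs = [sum(f[i] * y for f, y in zip(feats, ys)) for i in range(n)]
    return solve(gram, rhs)

def predict(w, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** p for p, c in enumerate(w))

def cv_error(xs, ys, degree, k=5):
    """Mean squared prediction error over k held-out folds."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        test = set(range(fold, n, k))  # every k-th point is held out
        train_x = [xs[i] for i in range(n) if i not in test]
        train_y = [ys[i] for i in range(n) if i not in test]
        w = fit_poly(train_x, train_y, degree)
        total += sum((predict(w, xs[i]) - ys[i]) ** 2 for i in test)
    return total / n

# Synthetic data (an assumption of this sketch): quadratic signal plus noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(60)]
ys = [1.0 + 2.0 * x - 3.0 * x ** 2 + rng.gauss(0.0, 0.3) for x in xs]

# Cross-validation error for models of increasing complexity.
errors = {d: cv_error(xs, ys, d) for d in range(7)}
best_degree = min(errors, key=errors.get)

for d, e in errors.items():
    print(f"degree {d}: CV error {e:.4f}")
print("lowest CV error at degree", best_degree)
```

In a setup like this, models that are too simple (degree 0 or 1) cannot fit the curvature and incur a large held-out error, while higher degrees fit the training folds more closely without a matching gain on the held-out folds. Choosing the degree with the lowest cross-validation error is one practical way of striking the balance between simplicity and accuracy that the theory analyzes.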
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Statistical Process Monitoring · Advanced Statistical Methods and Models
