A Theory of Cross-Validation Error

Peter D. Turney (National Research Council of Canada)

arXiv:cs/0212029 · cs.LG · May 23, 2007

Open Access

TL;DR

This paper develops a theoretical framework for understanding and minimizing cross-validation error in predicting real-valued attributes, emphasizing the balance between model simplicity and accuracy.

Contribution

It introduces a general theory of cross-validation error and details its application to linear regression and instance-based learning.

Findings

1. The theory explains the trade-off between simplicity and accuracy in prediction models.

2. It provides a method to optimize the balance to minimize cross-validation error.

3. The framework is applicable to various predictive algorithms.

Abstract

This paper presents a theory of error in cross-validation testing of algorithms for predicting real-valued attributes. The theory justifies the claim that predicting real-valued attributes requires balancing the conflicting demands of simplicity and accuracy. Furthermore, the theory indicates precisely how these conflicting demands must be balanced, in order to minimize cross-validation error. A general theory is presented, then it is developed in detail for linear regression and instance-based learning.
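The balance between simplicity and accuracy described in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's theory or derivation: it estimates k-fold cross-validation error for least-squares regression on polynomial features of increasing degree. The helper names `cv_error` and `polynomial_features`, the sine-shaped target, the noise level, and the sample size are all assumptions chosen for illustration.

```python
import numpy as np


def cv_error(X, y, n_folds=5):
    """Mean squared k-fold cross-validation error of least-squares regression."""
    n = len(y)
    # Split the row indices into n_folds contiguous, nearly equal parts.
    folds = np.array_split(np.arange(n), n_folds)
    sq_errors = []
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Fit on the training rows only, then predict the held-out rows.
        coef, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
        pred = X[test_idx] @ coef
        sq_errors.extend((y[test_idx] - pred) ** 2)
    return float(np.mean(sq_errors))


def polynomial_features(x, degree):
    """Columns 1, x, x^2, ..., x^degree (hypothetical helper for this sketch)."""
    return np.vander(x, degree + 1, increasing=True)


# Synthetic data: a smooth target plus noise (assumed, not from the paper).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=40)
y = np.sin(3.0 * x) + rng.normal(scale=0.3, size=40)

# Cross-validation error for models of increasing complexity.
errors = {d: cv_error(polynomial_features(x, d), y) for d in range(13)}
best_degree = min(errors, key=errors.get)

for d, e in errors.items():
    print(f"degree {d:2d}: CV mean squared error = {e:.4f}")
print("degree with lowest CV error:", best_degree)
```

In a run of this sketch, a constant model (degree 0) is too simple to follow the target, while very high degrees tend to fit the noise in the training folds. The degree with the lowest cross-validation error is the kind of balance point the abstract refers to, although the paper characterizes that balance analytically rather than by search.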

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Statistical Process Monitoring · Advanced Statistical Methods and Models