Theoretical Analyses of Cross-Validation Error and Voting in Instance-Based Learning
Peter D. Turney (National Research Council of Canada)

TL;DR
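This paper develops a general theory of cross-validation error in supervised learning, showing that the error splits into inaccuracy (error on the training set) and instability (sensitivity to noise), and applies the theory to voting among the k nearest neighbors in instance-based learning.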
Contribution
It introduces a theoretical framework for understanding cross-validation error components and examines voting's impact on stability in instance-based learning.
Findings
Cross-validation error has two components: inaccuracy (error on the training set) and instability (sensitivity to noise).
Voting can both stabilize and destabilize instance-based learning.
Guidelines are provided to optimize voting stability and accuracy.
Abstract
This paper begins with a general theory of error in cross-validation testing of algorithms for supervised learning from examples. It is assumed that the examples are described by attribute-value pairs, where the values are symbolic. Cross-validation requires a set of training examples and a set of testing examples. The value of the attribute that is to be predicted is known to the learner in the training set, but unknown in the testing set. The theory demonstrates that cross-validation error has two components: error on the training set (inaccuracy) and sensitivity to noise (instability). This general theory is then applied to voting in instance-based learning. Given an example in the testing set, a typical instance-based learning algorithm predicts the designated attribute by voting among the k nearest neighbors (the k most similar examples) to the testing example in the training set.…
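The voting step described in the abstract can be illustrated with a minimal sketch. It is not the paper's own formulation: the similarity measure (a Hamming distance that counts mismatched symbolic attribute values), the function names `hamming` and `knn_vote`, the tie-breaking behavior, and the toy data are all assumptions chosen for illustration.

```python
from collections import Counter

def hamming(a, b):
    # Assumed similarity measure: number of attributes whose symbolic values differ.
    return sum(x != y for x, y in zip(a, b))

def knn_vote(training, query, k=3):
    # training: list of (attribute_tuple, label) pairs with known labels.
    # Pick the k training examples most similar to the query.
    neighbors = sorted(training, key=lambda ex: hamming(ex[0], query))[:k]
    # Majority vote over the neighbors' labels; how ties are broken is
    # an arbitrary choice in this sketch (whatever Counter returns first).
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (color, shape, size) -> class
training = [
    (("red", "round", "small"), "apple"),
    (("red", "round", "large"), "apple"),
    (("yellow", "long", "small"), "banana"),
    (("yellow", "long", "large"), "banana"),
    (("green", "round", "small"), "apple"),
]

prediction = knn_vote(training, ("yellow", "round", "small"), k=3)
print(prediction)
```

With k = 3, the three nearest neighbors each differ from the query in one attribute; two are labeled "apple" and one "banana", so the vote yields "apple". Varying k changes how many neighbors take part in the vote, which is the parameter the paper's analysis of voting concerns.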
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Imbalanced Data Classification Techniques
