# Cross-validation

**Authors:** Sylvain Arlot (SELECT, LMO)

arXiv: 1703.03167 · 2017-03-10

## TL;DR

This survey reviews the classical cross-validation procedures and analyzes their properties for two goals: risk estimation and estimator selection. It closes with guidelines for choosing a procedure, based on bias, variance, and overpenalization considerations.

## Contribution

The survey offers a theoretical analysis of cross-validation methods, covering their bias, their variance, and second-order effects in estimator selection, to guide the choice of a procedure for a given learning problem.

## Key findings

- The bias (which can be corrected) and the variance of cross-validation risk estimates are computed.
- Second-order effects and overpenalization are analyzed for estimator selection.
- Guidelines are provided for choosing appropriate cross-validation procedures.
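
To make the bias statement concrete, here is the textbook form of the V-fold cross-validation risk estimate. The notation is an assumption of this summary, not taken from the paper: $D_n = (\xi_1, \dots, \xi_n)$ is an i.i.d. sample, $\mathcal{A}$ a learning rule, $\gamma$ a loss, $\mathcal{R}$ the corresponding risk, and $B_1, \dots, B_V$ a partition of $\{1, \dots, n\}$ into blocks of equal size $n/V$.

$$
\widehat{\mathcal{R}}^{\mathrm{vf}}(\mathcal{A}; D_n)
= \frac{1}{V} \sum_{j=1}^{V} \frac{1}{|B_j|} \sum_{i \in B_j}
\gamma\bigl(\mathcal{A}(D_n^{(-B_j)});\, \xi_i\bigr),
\qquad
\mathbb{E}\bigl[\widehat{\mathcal{R}}^{\mathrm{vf}}(\mathcal{A}; D_n)\bigr]
= \mathbb{E}\bigl[\mathcal{R}\bigl(\mathcal{A}(D_{n(V-1)/V})\bigr)\bigr].
$$

Here $D_n^{(-B_j)}$ is the sample with block $B_j$ removed. Each training set has only $n(V-1)/V$ points, so the estimate targets the risk at a smaller sample size than $n$. When the risk decreases with the sample size, this typically makes the estimate pessimistic, and the gap shrinks as $V$ grows.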

## Abstract

This text is a survey on cross-validation. We define all classical cross-validation procedures, and we study their properties for two different goals: estimating the risk of a given estimator, and selecting the best estimator among a given family. For the risk estimation problem, we compute the bias (which can also be corrected) and the variance of cross-validation methods. For estimator selection, we first provide a first-order analysis (based on expectations). Then, we explain how to take into account second-order terms (from variance computations, and by taking into account the usefulness of overpenalization). This allows us, in the end, to provide some guidelines for choosing the best cross-validation method for a given learning problem.
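
The two goals named in the abstract can be illustrated with a minimal sketch of V-fold cross-validation. It is not code from the paper. The estimator family (piecewise-constant regression on equal-width bins of [0, 1)), the synthetic data, and every function name are assumptions chosen for illustration. `v_fold_risk` estimates the quadratic risk of one estimator, and `select_n_bins` picks the candidate with the smallest estimate, using the same folds for every candidate.

```python
import math
import random

def fit_regressogram(xs, ys, n_bins):
    """Piecewise-constant least-squares fit on n_bins equal-width bins of [0, 1)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)
    # Empty bins fall back to the overall training mean (an arbitrary convention).
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[min(int(x * n_bins), n_bins - 1)]

def make_folds(n, v, seed=0):
    """Shuffle range(n) and cut it into v disjoint validation blocks."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[k::v] for k in range(v)]

def v_fold_risk(xs, ys, n_bins, folds):
    """V-fold estimate of the quadratic risk: train without a block, test on it."""
    fold_errors = []
    for val in folds:
        held_out = set(val)
        train = [i for i in range(len(xs)) if i not in held_out]
        predict = fit_regressogram(
            [xs[i] for i in train], [ys[i] for i in train], n_bins
        )
        fold_errors.append(
            sum((predict(xs[i]) - ys[i]) ** 2 for i in val) / len(val)
        )
    # Average of the per-block validation errors.
    return sum(fold_errors) / len(fold_errors)

def select_n_bins(xs, ys, candidates, v=5, seed=0):
    """Estimator selection: minimise the V-fold criterion over the candidates."""
    folds = make_folds(len(xs), v, seed)  # same folds for every candidate
    risks = {d: v_fold_risk(xs, ys, d, folds) for d in candidates}
    return min(risks, key=risks.get), risks

# Synthetic regression data (illustrative only).
rng = random.Random(42)
xs = [rng.random() for _ in range(200)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]

best, risks = select_n_bins(xs, ys, candidates=[1, 2, 4, 8, 16, 32])
print(best, {d: round(r, 3) for d, r in risks.items()})
```

Reusing the same folds across candidates means that differences between the risk estimates reflect the candidates rather than the random split. Plain minimisation of the criterion corresponds to the first-order view described in the abstract. The sketch does not attempt the second-order corrections the survey discusses, such as overpenalization.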

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.03167/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1703.03167/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1703.03167/full.md

---
Source: https://tomesphere.com/paper/1703.03167