Consistency of cross validation for comparing regression procedures

Yuhong Yang

arXiv:0803.2963 · math.ST · December 18, 2008

TL;DR

This paper investigates the theoretical consistency of cross validation when used to compare regression procedures, including parametric and nonparametric methods, revealing conditions under which CV reliably selects the better method.

Contribution

It provides new theoretical insights into the conditions for cross validation consistency in comparing diverse regression procedures, especially in nonparametric settings.

Findings

01

Under some conditions, and with an appropriate data splitting ratio, cross validation is consistent: it selects the better procedure with probability approaching 1.

02

When two procedures converge at the same nonparametric rate, the proportion of data used for evaluation need not dominate, in contrast to the parametric case.

03

The evaluation set can be smaller than the estimation set without losing consistency.

Abstract

Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for kernel smoothing). However, little is known about consistency of cross validation when applied to compare between parametric and nonparametric methods or within nonparametric methods. We show that under some conditions, with an appropriate choice of data splitting ratio, cross validation is consistent in the sense of selecting the better procedure with probability approaching 1. Our results reveal interesting behavior of cross validation. When comparing two models (procedures) converging at the same nonparametric rate, in contrast to the parametric case, it turns out that the proportion of data used for evaluation in CV does not need to be dominating…
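
To make the role of the data splitting ratio concrete, here is a minimal sketch of a single-split (hold-out) comparison between a parametric and a nonparametric procedure. It is not taken from the paper: the two procedures (a least-squares line and a Gaussian-kernel smoother), the simulated regression function, the bandwidth, and the names `fit_linear`, `fit_kernel`, `holdout_compare`, and `eval_fraction` are all illustrative assumptions.

```python
import numpy as np


def fit_linear(x_train, y_train):
    """Parametric procedure (assumed for illustration): least-squares line."""
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    return lambda x: slope * np.asarray(x) + intercept


def fit_kernel(x_train, y_train, bandwidth=0.1):
    """Nonparametric procedure (assumed): Nadaraya-Watson, Gaussian kernel."""
    def predict(x):
        x = np.atleast_1d(x)
        # Weight of each training point for each query point.
        w = np.exp(-0.5 * ((x[:, None] - x_train[None, :]) / bandwidth) ** 2)
        return (w @ y_train) / w.sum(axis=1)
    return predict


def holdout_compare(x, y, procedures, eval_fraction, rng):
    """Fit each procedure on the estimation part and score it on the
    evaluation part; return the name with the smallest squared error."""
    n = len(x)
    n_eval = max(1, int(round(eval_fraction * n)))
    perm = rng.permutation(n)
    eval_idx, est_idx = perm[:n_eval], perm[n_eval:]
    errors = {}
    for name, fit in procedures.items():
        predict = fit(x[est_idx], y[est_idx])
        residuals = y[eval_idx] - predict(x[eval_idx])
        errors[name] = float(np.mean(residuals ** 2))
    return min(errors, key=errors.get), errors


# Simulated data (an assumption): a nonlinear truth, so the smoother is better.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)

procedures = {"linear": fit_linear, "kernel": fit_kernel}
# eval_fraction < 0.5 means the evaluation set is smaller than the estimation set.
winner, errors = holdout_compare(x, y, procedures, eval_fraction=0.3, rng=rng)
print(winner, errors)
```

Varying `eval_fraction` and the sample size `n` in this sketch is one way to explore empirically how the split between estimation and evaluation data affects which procedure is selected; it does not reproduce the paper's conditions or proofs.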
