Model selection by resampling penalization

Sylvain Arlot (LIENS; INRIA Rocquencourt)

arXiv:0906.3124·math.ST·June 19, 2009

TL;DR

This paper introduces a new family of resampling-based penalization methods for model selection that adapt to heteroscedasticity and outperform V-fold cross-validation in prediction accuracy.

Contribution

It generalizes existing resampling penalties to any exchangeable bootstrap scheme and proves their near-optimality and adaptability to heteroscedastic noise in regression models.

Findings

01. Resampling penalties satisfy a non-asymptotic oracle inequality.

02. They adapt to both smoothness and heteroscedasticity of the data.

03. They outperform V-fold cross-validation in prediction error, especially with low signal-to-noise ratio.

Abstract

In this paper, a new family of resampling-based penalization procedures for model selection is defined in a general framework. It generalizes several methods, including Efron's bootstrap penalization and the leave-one-out penalization recently proposed by Arlot (2008), to any exchangeable weighted bootstrap resampling scheme. In the heteroscedastic regression framework, assuming the models to have a particular structure, these resampling penalties are proved to satisfy a non-asymptotic oracle inequality with leading constant close to 1. In particular, they are asymptotically optimal. Resampling penalties are used for defining an estimator adapting simultaneously to the smoothness of the regression function and to the heteroscedasticity of the noise. This is remarkable because resampling penalties are general-purpose devices, which have not been built specifically to handle…
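To make the idea of a resampling penalty concrete, here is a minimal Python sketch for histogram (regressogram) models on [0, 1). The abstract does not state the penalty formula, so the sketch assumes the form pen(m) = C · E_W[P_n γ(ŝ_m^W) − P_n^W γ(ŝ_m^W)], where W is a vector of exchangeable resampling weights and ŝ_m^W is the weighted least-squares fit in model m. Efron bootstrap weights, the constant C = 1, the treatment of bins that receive zero resampled weight (their points are simply dropped from the gap), and all function names are assumptions made for illustration, not the paper's exact procedure.

```python
import numpy as np


def fit_histogram(x, y, w, n_bins):
    """Weighted least-squares regressogram on [0, 1) with n_bins equal bins."""
    bins = np.minimum((x * n_bins).astype(int), n_bins - 1)
    sum_w = np.bincount(bins, weights=w, minlength=n_bins)
    sum_wy = np.bincount(bins, weights=w * y, minlength=n_bins)
    filled = sum_w > 0
    # Bins with zero total weight have no fitted value; they are flagged below.
    means = np.divide(sum_wy, sum_w, out=np.zeros(n_bins), where=filled)
    return bins, means, filled


def resampling_penalty(x, y, n_bins, n_resamples=200, const=1.0, seed=None):
    """Monte Carlo estimate of const * E_W[(P_n - P_n^W) gamma(s_hat^W)]."""
    rng = np.random.default_rng(seed)
    n = len(y)
    gaps = np.empty(n_resamples)
    for b in range(n_resamples):
        # Efron bootstrap weights: multinomial counts that sum to n.
        w = rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)
        bins, means, filled = fit_histogram(x, y, w, n_bins)
        sq_err = (y - means[bins]) ** 2
        keep = filled[bins]  # assumption: ignore points in empty resampled bins
        gaps[b] = (np.sum(sq_err[keep]) - np.sum(w[keep] * sq_err[keep])) / n
    return const * float(np.mean(gaps))


def select_model(x, y, candidate_bins, **penalty_kwargs):
    """Pick the bin count minimizing empirical risk plus resampling penalty."""
    ones = np.ones(len(y))
    crit = {}
    for d in candidate_bins:
        bins, means, _ = fit_histogram(x, y, ones, d)
        emp_risk = float(np.mean((y - means[bins]) ** 2))
        crit[d] = emp_risk + resampling_penalty(x, y, d, **penalty_kwargs)
    return min(crit, key=crit.get), crit


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(size=300)
    # Synthetic heteroscedastic data: the noise level grows with x.
    y = np.sin(2 * np.pi * x) + (0.2 + 0.8 * x) * rng.normal(size=300)
    best, crit = select_model(x, y, [1, 2, 5, 10, 20], seed=1)
    print("selected number of bins:", best)
```

Because the penalty is computed from the data through the resampled fits, it does not require a noise-variance estimate, which is what lets this kind of criterion react to heteroscedastic noise. Other exchangeable weight schemes would replace the `rng.multinomial` line and, in general, call for a different constant.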
