Data-driven calibration of linear estimators with minimal penalties

Sylvain Arlot (LIENS, INRIA Paris - Rocquencourt); Francis Bach (LIENS, INRIA Paris - Rocquencourt)

arXiv:0909.1884·math.ST·September 15, 2011·NeurIPS·31 cites

Open Access

TL;DR

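This paper introduces a data-driven method for choosing among linear estimators in non-parametric regression. It uses minimal penalties to estimate the noise variance, then uses that estimate for model selection and parameter tuning; in simulations it often outperforms existing calibration procedures such as generalized cross-validation.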

Contribution

The authors develop a new algorithm that consistently estimates the noise variance via minimal penalties and plugs the estimate into Mallows' $C_L$ penalty; the resulting procedure is proved to satisfy an oracle inequality.
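For orientation, Mallows' $C_L$ with a plug-in variance estimate is usually written as the criterion below. This is a sketch in assumed notation, since this page does not define any symbols: $Y \in \mathbb{R}^n$ is the response vector, $A_\lambda$ is the smoothing matrix of the linear estimator indexed by $\lambda$, and $\hat\sigma^2$ is the estimated noise variance.

```latex
\hat\lambda \in \arg\min_{\lambda}
\left\{ \frac{1}{n}\,\lVert Y - A_\lambda Y \rVert^2
      + \frac{2\,\hat\sigma^2\,\operatorname{tr}(A_\lambda)}{n} \right\}
```

The first term is the empirical risk and the second penalizes the degrees of freedom $\operatorname{tr}(A_\lambda)$, which is why the quality of $\hat\sigma^2$ matters for the final selection.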

Findings

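01. The proposed method estimates the noise variance consistently.

02. Plugging this estimate into Mallows' $C_L$ improves calibration over generalized cross-validation.

03. Simulations with kernel ridge regression and multiple kernel learning often show significant gains over existing calibration procedures.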

Abstract

This paper tackles the problem of selecting among several linear estimators in non-parametric regression; this includes model selection for linear regression, the choice of a regularization parameter in kernel ridge regression, spline smoothing or locally weighted regression, and the choice of a kernel in multiple kernel learning. We propose a new algorithm which first estimates consistently the variance of the noise, based upon the concept of minimal penalty, which was previously introduced in the context of model selection. Then, plugging our variance estimate in Mallows' $C_{L}$ penalty is proved to lead to an algorithm satisfying an oracle inequality. Simulation experiments with kernel ridge regression and multiple kernel learning show that the proposed algorithm often improves significantly existing calibration procedures such as generalized cross-validation.
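The two-step procedure described in the abstract can be sketched in Python for kernel ridge regression. This is a minimal illustration under stated assumptions, not the authors' implementation: the minimal-penalty shape $2\operatorname{tr}(A_\lambda) - \operatorname{tr}(A_\lambda^\top A_\lambda)$ and the degrees-of-freedom threshold $n/3$ used to locate the jump are assumptions not stated on this page, and the Laplace kernel, the grids, the synthetic data, and all function names are hypothetical choices made for the example.

```python
import numpy as np


def smoothing_matrices(K, lambdas):
    """Kernel ridge smoothers A_lam = K (K + n*lam*I)^{-1}, one per lambda."""
    n = K.shape[0]
    evals, U = np.linalg.eigh(K)        # K is symmetric positive semi-definite
    evals = np.clip(evals, 0.0, None)   # guard against tiny negative eigenvalues
    return [(U * (evals / (evals + n * lam))) @ U.T for lam in lambdas]


def estimate_noise_variance(Y, smoothers, df_threshold, C_grid):
    """Step 1 (assumed form): smallest constant C at which the estimator
    minimizing RSS + C * (2 tr(A) - tr(A^T A)) drops to df <= df_threshold."""
    rss = np.array([np.sum((Y - A @ Y) ** 2) for A in smoothers])
    df = np.array([np.trace(A) for A in smoothers])
    min_pen_shape = np.array(
        [2.0 * np.trace(A) - np.sum(A * A) for A in smoothers]  # sum(A*A) = tr(A^T A)
    )
    for C in C_grid:  # C_grid must be sorted in increasing order
        chosen = np.argmin(rss + C * min_pen_shape)
        if df[chosen] <= df_threshold:
            return float(C)
    return float(C_grid[-1])  # no jump found on the grid


def select_with_mallows(Y, smoothers, sigma2_hat):
    """Step 2: index minimizing Mallows' C_L with the plug-in variance."""
    rss = np.array([np.sum((Y - A @ Y) ** 2) for A in smoothers])
    df = np.array([np.trace(A) for A in smoothers])
    return int(np.argmin(rss + 2.0 * sigma2_hat * df))


# Synthetic example (hypothetical data): sine signal plus Gaussian noise.
rng = np.random.default_rng(0)
n, sigma = 200, 0.5
x = np.linspace(0.0, 1.0, n)
Y = np.sin(2.0 * np.pi * x) + sigma * rng.standard_normal(n)

# Laplace kernel: its slowly decaying spectrum lets small lambdas reach df close to n.
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)
lambdas = np.geomspace(1e-6, 10.0, 40)
smoothers = smoothing_matrices(K, lambdas)

C_grid = np.geomspace(1e-3, 10.0, 200) * np.var(Y)
sigma2_hat = estimate_noise_variance(Y, smoothers, n / 3, C_grid)
best = select_with_mallows(Y, smoothers, sigma2_hat)
df_selected = float(np.trace(smoothers[best]))

print(f"true variance {sigma**2:.3f}, estimated {sigma2_hat:.3f}")
print(f"selected lambda {lambdas[best]:.2e} with df {df_selected:.1f}")
```

The threshold rule only makes sense when the collection contains estimators with degrees of freedom close to $n$, which is why the sketch uses a kernel with a slowly decaying spectrum and a grid that reaches very small regularization values.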

Taxonomy

Topics: Statistical Methods and Inference · Control Systems and Identification · Advanced Statistical Methods and Models