Data-driven calibration of linear estimators with minimal penalties
Sylvain Arlot (LIENS, INRIA Paris - Rocquencourt), Francis Bach (LIENS, INRIA Paris - Rocquencourt)

TL;DR
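This paper introduces a data-driven method for selecting among linear estimators in non-parametric regression. It uses minimal penalties to estimate the noise variance, then uses that estimate for model selection and parameter tuning; in simulations it often improves significantly on existing calibration procedures such as generalized cross-validation.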
Contribution
The authors develop a new algorithm that consistently estimates the noise variance via minimal penalties and plugs this estimate into Mallows' $C_L$ penalty; the resulting procedure is proved to satisfy an oracle inequality.
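As a reading aid, the plug-in step can be written out under notation that is assumed here rather than taken from this page: each candidate estimator is $\hat F_\lambda = A_\lambda Y$ for a smoothing matrix $A_\lambda$, $n$ is the sample size, and $\hat\sigma^2$ is the variance estimate from the first step. In that notation, the standard form of Mallows' $C_L$ selection rule reads:

$$
\hat\lambda \in \arg\min_{\lambda} \left\{ \frac{1}{n} \lVert Y - A_\lambda Y \rVert^2 + \frac{2 \hat\sigma^2 \operatorname{tr}(A_\lambda)}{n} \right\}
$$

The first term is the empirical risk and the second is the penalty, proportional to the degrees of freedom $\operatorname{tr}(A_\lambda)$.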
Findings
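The proposed method estimates the noise variance consistently.
Plugging this estimate into Mallows' penalty often calibrates better than generalized cross-validation.
In simulation experiments with kernel ridge regression and multiple kernel learning, the improvement is often significant.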
Abstract
This paper tackles the problem of selecting among several linear estimators in non-parametric regression; this includes model selection for linear regression, the choice of a regularization parameter in kernel ridge regression, spline smoothing or locally weighted regression, and the choice of a kernel in multiple kernel learning. We propose a new algorithm which first consistently estimates the variance of the noise, based upon the concept of minimal penalty, which was previously introduced in the context of model selection. Then, plugging our variance estimate into Mallows' penalty is proved to lead to an algorithm satisfying an oracle inequality. Simulation experiments with kernel ridge regression and multiple kernel learning show that the proposed algorithm often significantly improves on existing calibration procedures such as generalized cross-validation.
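The two steps described in the abstract can be sketched for kernel ridge regression as follows. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the smoothing matrix is taken to be $A_\lambda = K (K + n\lambda I)^{-1}$, the minimal-penalty shape is assumed to be $2\operatorname{tr}(A_\lambda) - \operatorname{tr}(A_\lambda^\top A_\lambda)$, and the "largest drop in selected degrees of freedom" rule used to locate the variance estimate is a simple heuristic chosen for this sketch. The function name `calibrate_krr` and the default grid of constants are hypothetical.

```python
import numpy as np


def calibrate_krr(K, y, lambdas, C_grid=None):
    """Pick a ridge parameter via a minimal-penalty variance estimate.

    Assumptions of this sketch (not taken from the summary above):
      * smoother A_lam = K (K + n*lam*I)^{-1}
      * minimal-penalty shape 2*tr(A) - tr(A^T A)
      * the variance estimate is the constant C at which the selected
        degrees of freedom drop the most (a jump heuristic)
    """
    y = np.asarray(y, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    n = y.shape[0]

    # Work in the eigenbasis of K, where every A_lam is diagonal.
    mu, U = np.linalg.eigh(K)
    mu = np.clip(mu, 0.0, None)          # guard against tiny negative eigenvalues
    z = U.T @ y

    # s[j, i] = i-th eigenvalue of A_{lambdas[j]}
    s = mu[None, :] / (mu[None, :] + n * lambdas[:, None])
    rss = np.sum(((1.0 - s) * z[None, :]) ** 2, axis=1)   # ||y - A y||^2
    df = s.sum(axis=1)                                    # tr(A)
    pen_min_shape = 2.0 * df - (s ** 2).sum(axis=1)       # 2 tr(A) - tr(A^T A)

    if C_grid is None:
        # Hypothetical default: constants spread around the variance of y.
        C_grid = np.var(y) * np.logspace(-3, 1, 200)
    C_grid = np.asarray(C_grid, dtype=float)

    # Step 1: for each constant C, record the degrees of freedom selected
    # by the minimal-penalty criterion, then locate the largest drop.
    sel_df = np.array([df[np.argmin(rss + C * pen_min_shape)] for C in C_grid])
    jump = int(np.argmax(sel_df[:-1] - sel_df[1:]))
    sigma2_hat = float(C_grid[jump + 1])

    # Step 2: plug the variance estimate into Mallows' C_L penalty.
    best = int(np.argmin(rss + 2.0 * sigma2_hat * df))
    return float(lambdas[best]), sigma2_hat
```

A call such as `lam_hat, sigma2_hat = calibrate_krr(K, y, np.logspace(-8, 1, 40))` returns the selected regularization parameter and the variance estimate. Working in the eigenbasis of `K` avoids one matrix inversion per candidate value of `lambdas`.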
Taxonomy
Topics: Statistical Methods and Inference · Control Systems and Identification · Advanced Statistical Methods and Models
