Generalization error minimization: a new approach to model evaluation and selection with an application to penalized regression
Ning Xu, Jian Hong, Timothy C.G. Fisher

TL;DR
This paper introduces a new framework called generalization error minimization (GEM) for model evaluation and selection, providing theoretical bounds and practical methods for penalized regression models to improve external validity.
Contribution
The paper develops a unified GEM framework for penalized regression, deriving theoretical properties and practical guidelines for model selection and validation.
Findings
Derived upper bounds for generalization errors that depend on sample size, model complexity, and the distribution of the loss function.
Unified analysis of lasso, ridge, and bridge estimators under GEM.
Proposed GR^2 as a new measure of generalization ability.
Abstract
We study model evaluation and model selection from the perspective of generalization ability (GA): the ability of a model to predict outcomes in new samples from the same population. We believe that GA is one way to formally address concerns about the external validity of a model. The GA of a model estimated on a sample can be measured by its empirical out-of-sample errors, called the generalization errors (GE). We derive upper bounds for the GE, which depend on sample sizes, model complexity and the distribution of the loss function. The upper bounds can be used to evaluate the GA of a model ex ante. We propose using generalization error minimization (GEM) as a framework for model selection. Using GEM, we are able to unify a large class of penalized regression estimators, including lasso, ridge and bridge, under the same set of assumptions. We establish finite-sample and asymptotic…
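To make the idea of measuring GA by out-of-sample error concrete, here is a minimal sketch. It is not the authors' GEM procedure or their bounds. It assumes simulated data, a closed-form ridge estimator without an intercept, a single train/validation split, and mean squared error as the loss. The helper names (`ridge_fit`, `mse`, `select_penalty`) and the penalty grid are hypothetical, chosen only for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (no intercept): solve (X'X + lam*I) b = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def mse(X, y, beta):
    """Mean squared prediction error of beta on the sample (X, y)."""
    resid = y - X @ beta
    return float(np.mean(resid ** 2))

def select_penalty(X_train, y_train, X_val, y_val, lambdas):
    """Fit ridge for each penalty; return the penalty with the smallest
    held-out error, plus (lam, in-sample error, out-of-sample error) rows."""
    results = []
    for lam in lambdas:
        beta = ridge_fit(X_train, y_train, lam)
        results.append((lam,
                        mse(X_train, y_train, beta),
                        mse(X_val, y_val, beta)))
    best = min(results, key=lambda row: row[2])
    return best[0], results

# Simulated data (an assumption): sparse linear model with Gaussian noise.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]
y = X @ beta_true + rng.normal(size=n)

# One split: first half to estimate, second half as the "new sample".
X_train, y_train = X[:100], y[:100]
X_val, y_val = X[100:], y[100:]

lambdas = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam, results = select_penalty(X_train, y_train, X_val, y_val, lambdas)

for lam, err_in, err_out in results:
    print(f"lambda={lam:6.1f}  in-sample={err_in:.3f}  out-of-sample={err_out:.3f}")
print("selected lambda:", best_lam)
```

The in-sample error can only rise as the penalty grows, while the out-of-sample error need not, which is why the held-out error rather than the training fit is used to pick the penalty in this sketch.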
Taxonomy
Topics: Statistical Methods and Inference · Advanced Causal Inference Techniques · Advanced Statistical Methods and Models
