The Choice of Normalization Influences Shrinkage in Regularized Regression
Johan Larsson, Jonas Wallin

TL;DR
This paper investigates how different feature normalization methods affect regularized regression models like lasso, ridge, and elastic net, revealing that normalization choices significantly influence model coefficients and suggesting strategies to mitigate these effects.
Contribution
It provides the first systematic analysis of normalization effects on regularized regression, offering practical guidelines for feature scaling in various scenarios.
Findings
The class balance of a binary feature directly influences its coefficient; scaling binary features with their variance (for the lasso) or standard deviation can mitigate this effect.
Normalizing penalty weights can help when the features are of mixed types.
Different normalization strategies impact the variance and bias of coefficient estimates.
Abstract
Regularized models are often sensitive to the scales of the features in the data, and it has therefore become standard practice to normalize (center and scale) the features before fitting the model. But there are many different ways to normalize the features, and the choice may have dramatic effects on the resulting model. In spite of this, there has so far been no research on this topic. In this paper, we begin to bridge this knowledge gap by studying normalization in the context of lasso, ridge, and elastic net regression. We focus on binary features and show that their class balances (proportions of ones) directly influence the regression coefficients and that this effect depends on the combination of normalization and regularization methods used. We demonstrate that this effect can be mitigated by scaling binary features with their variance in the case of the lasso and standard…
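The interaction between class balance and scaling described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a single binary feature, a noiseless response `y = beta_true * x`, and a fixed penalty `lam = 0.1`, and the names `soft_threshold` and `lasso_binary_coef` are hypothetical. The feature is centered and divided by `sd**delta`, so `delta = 0` means no scaling, `delta = 1` scaling by the standard deviation, and `delta = 2` scaling by the variance. Only the lasso case is covered.

```python
import math


def soft_threshold(z, lam):
    """Soft-thresholding operator: the closed-form lasso update for one feature."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def lasso_binary_coef(q, delta, lam=0.1, beta_true=1.0, n=1000):
    """Single-feature lasso coefficient, reported on the original feature scale.

    q     : class balance (proportion of ones) of the binary feature
    delta : scaling exponent; the centered feature is divided by sd**delta
            (0 = no scaling, 1 = standard deviation, 2 = variance)
    """
    ones = round(q * n)
    x = [1.0] * ones + [0.0] * (n - ones)
    y = [beta_true * xi for xi in x]  # noiseless response (assumption)

    x_mean = sum(x) / n
    y_mean = sum(y) / n
    var = sum((xi - x_mean) ** 2 for xi in x) / n
    scale = math.sqrt(var) ** delta

    # Centered and scaled feature.
    xt = [(xi - x_mean) / scale for xi in x]

    # Closed-form minimizer of (1/(2n)) * ||y - b0 - xt*b||^2 + lam * |b|.
    rho = sum(xti * (yi - y_mean) for xti, yi in zip(xt, y)) / n
    norm_sq = sum(xti ** 2 for xti in xt) / n
    beta_scaled = soft_threshold(rho, lam) / norm_sq

    # Map the estimate back to the original scale of the feature.
    return beta_scaled / scale


if __name__ == "__main__":
    for q in (0.5, 0.9, 0.99):
        estimates = [lasso_binary_coef(q, d) for d in (0, 1, 2)]
        print(q, [round(b, 3) for b in estimates])
```

In this simplified setting the estimate works out to `max(beta_true - lam * var**(delta/2 - 1), 0)`, where `var = q * (1 - q)`. With variance scaling (`delta = 2`) the shrinkage equals `lam` regardless of `q`, whereas with no scaling or standard-deviation scaling the shrinkage grows as the feature becomes more imbalanced and eventually sets the coefficient to zero.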
Taxonomy
Topics: Advanced Statistical Methods and Models
Methods: Focus
