Optimal ridge penalty for real-world high-dimensional data can be zero or negative due to the implicit ridge regularization

Dmitry Kobak; Jonathan Lomond; Benoit Sanchez

arXiv:1805.10939 · math.ST · June 6, 2024 · 24 cites

TL;DR

In high-dimensional linear regression with $n ≪ p$, the optimal ridge penalty can be zero or negative, because low-variance directions in the predictor space already act as an implicit ridge regularizer. This challenges the conventional wisdom that large models require strong explicit regularization.

Contribution

Using simulations, real-life high-dimensional data sets, and theory, the paper shows that a zero or negative ridge penalty can be optimal when $n ≪ p$, and relates this implicit ridge regularization to augmenting a linear model with random covariates.

Findings

01

Optimal ridge penalty can be zero or negative in high-dimensional data.

02

Implicit regularization from low-variance directions can negate the need for positive ridge penalties.

03

A theoretical analysis of a spiked covariance model shows that the optimal penalty can be negative when $n ≪ p$.

Abstract

A conventional wisdom in statistical learning is that large models require strong regularization to prevent overfitting. Here we show that this rule can be violated by linear regression in the underdetermined $n ≪ p$ situation under realistic conditions. Using simulations and real-life high-dimensional data sets, we demonstrate that an explicit positive ridge penalty can fail to provide any improvement over the minimum-norm least squares estimator. Moreover, the optimal value of ridge penalty in this situation can be negative. This happens when the high-variance directions in the predictor space can predict the response variable, which is often the case in the real-world high-dimensional data. In this regime, low-variance directions provide an implicit ridge regularization and can make any further positive ridge penalty detrimental. We prove that augmenting any linear model with random…
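The estimators named in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' code: it assumes the dual form of ridge regression, $\hat\beta_\lambda = X^\top (X X^\top + \lambda I)^{-1} y$, which for $n < p$ and full-rank $X$ reduces to the minimum-norm least squares estimator at $\lambda = 0$ and stays well defined for negative $\lambda$ as long as $\lambda$ is larger than minus the smallest eigenvalue of $X X^\top$. The function name `ridge_dual`, the toy Gaussian data, and the choice of negative penalty are all hypothetical.

```python
import numpy as np

def ridge_dual(X, y, lam):
    """Ridge estimator in dual form: X^T (X X^T + lam * I)^{-1} y.

    For n < p and full-rank X this is defined at lam = 0 (minimum-norm
    least squares) and for negative lam above -min eigenvalue of X X^T.
    """
    n = X.shape[0]
    gram = X @ X.T                      # n x n Gram matrix
    return X.T @ np.linalg.solve(gram + lam * np.eye(n), y)

# Toy underdetermined problem (illustrative data, not from the paper).
rng = np.random.default_rng(0)
n, p = 20, 200
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# lam = 0 gives the minimum-norm interpolating solution.
beta_min_norm = ridge_dual(X, y, 0.0)

# A negative penalty is admissible while X X^T + lam * I stays positive definite.
smallest_eig = np.linalg.eigvalsh(X @ X.T)[0]   # eigenvalues are sorted ascending
lam_negative = -0.5 * smallest_eig
beta_negative = ridge_dual(X, y, lam_negative)
```

Whether a negative value of `lam` actually lowers the prediction error depends on the data; on isotropic toy data like this there is no reason to expect that it does. The sketch only shows that the estimator is well defined across zero and into the negative range.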

Code & Models

Repositories

dkobak/high-dim-ridge (Official)

Taxonomy

Topics: Statistical Methods and Inference · Sparse and Compressive Sensing Techniques · Gaussian Processes and Bayesian Inference

Methods: Linear Regression