The distribution of Ridgeless least squares interpolators
Qiyang Han, Xiaocong Xu

TL;DR
This paper characterizes the distribution of Ridgeless least squares interpolators in high dimensions, revealing how implicit regularization affects prediction and estimation risks under general conditions.
Contribution
It provides a precise distributional analysis of Ridgeless interpolators via Gaussian sequence models, extending to non-Gaussian designs and general weighted risks.
Findings
Distribution of Ridgeless interpolator characterized in high dimensions.
Explicit formulas for weighted $\ell_q$ risks of Ridge estimators.
Cross-validation schemes can simultaneously optimize multiple risks.
Abstract
The Ridgeless minimum $\ell_2$-norm interpolator in overparametrized linear regression has attracted considerable attention in recent years in both machine learning and statistics communities. While it seems to defy conventional wisdom that overfitting leads to poor prediction, recent theoretical research on its $\ell_2$-type risks reveals that its norm minimizing property induces an `implicit regularization' that helps prediction in spite of interpolation. This paper takes a further step that aims at understanding its precise stochastic behavior as a statistical estimator. Specifically, we characterize the distribution of the Ridgeless interpolator in high dimensions, in terms of a Ridge estimator in an associated Gaussian sequence model with positive regularization, which provides a precise quantification of the prescribed implicit regularization in the most general distributional…
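As a rough illustration of the object studied in the abstract, the sketch below computes a minimum-norm least squares interpolator on simulated data and compares it with a Ridge estimator at a very small penalty. It is not taken from the paper: the function names, the simulated Gaussian design, the noise level, and the `1/n` normalization of the Ridge objective are all assumptions made for the example.

```python
import numpy as np

def ridgeless_interpolator(X, y):
    """Minimum l2-norm solution of X b = y, via the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(X) @ y

def ridge_estimator(X, y, lam):
    """Ridge estimator for the objective ||y - X b||^2 / n + lam * ||b||^2.

    The 1/n normalization is an assumption of this sketch.
    """
    n, _ = X.shape
    # Dual form X^T (X X^T + n*lam*I_n)^{-1} y, convenient when p > n.
    return X.T @ np.linalg.solve(X @ X.T + n * lam * np.eye(n), y)

# Hypothetical overparametrized setup: more features (p) than samples (n).
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + 0.5 * rng.standard_normal(n)

b0 = ridgeless_interpolator(X, y)
b_small = ridge_estimator(X, y, 1e-10)

# The Ridgeless solution fits the training data exactly (up to rounding).
interp_residual = np.max(np.abs(X @ b0 - y))
# Ridge with a vanishing penalty approaches the Ridgeless solution.
gap = np.linalg.norm(b0 - b_small)

print(f"max training residual: {interp_residual:.2e}")
print(f"distance to small-penalty Ridge: {gap:.2e}")
```

In this toy setting the interpolator reproduces `y` on the training sample and coincides numerically with the small-penalty Ridge solution, which is the sense in which the estimator is called "Ridgeless". The positive regularization the abstract refers to arises in the associated Gaussian sequence model, which this sketch does not attempt to reproduce.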
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques · Statistical Mechanics and Entropy
