Scaling and renormalization in high-dimensional regression
Alexander Atanasov, Jacob A. Zavatone-Veth, Cengiz Pehlevan

TL;DR
This paper offers a unified analytical framework for understanding high-dimensional ridge regression, revealing how statistical fluctuations and weight structures influence model performance and scaling laws.
Contribution
It presents a deterministic-equivalence approach, built on random matrix theory and free probability, for deriving explicit formulas for errors and scaling behaviors in ridge regression and related models.
Findings
Statistical fluctuations can be absorbed into a renormalized ridge parameter.
Training and generalization errors are given by explicit analytic formulas.
Random feature models exhibit a variance-limited scaling regime.
Abstract
From benign overfitting in overparameterized models to rich power-law scalings in performance, simple ridge regression displays surprising behaviors sometimes thought to be limited to deep neural networks. This balance of phenomenological richness with analytical tractability makes ridge regression the model system of choice in high-dimensional machine learning. In this paper, we present a unifying perspective on recent results on ridge regression using the basic tools of random matrix theory and free probability, aimed at readers with backgrounds in physics and deep learning. We highlight the fact that statistical fluctuations in empirical covariance matrices can be absorbed into a renormalization of the ridge parameter. This 'deterministic equivalence' allows us to obtain analytic formulas for the training and generalization errors in a few lines of algebra by leveraging the…
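The renormalization described in the abstract can be illustrated numerically. The sketch below is not taken from the paper. It assumes the self-consistent equation commonly used in the deterministic-equivalence literature, kappa = lam + kappa * df1(kappa) / P with df1(kappa) = sum_i s_i / (s_i + kappa), where s_i are the population covariance eigenvalues and P is the number of samples. The function names, parameter names, and the isotropic test case are all hypothetical, and the paper's own notation may differ.

```python
import numpy as np


def renormalized_ridge(eigs, lam, n_samples, n_iter=1000, tol=1e-12):
    """Solve kappa = lam + kappa * df1(kappa) / n_samples by fixed-point iteration.

    eigs: eigenvalues of the population covariance (assumed known here).
    df1(kappa) = sum_i eigs_i / (eigs_i + kappa).
    """
    eigs = np.asarray(eigs, dtype=float)
    # lam + trace / n_samples is an upper bound on the fixed point, so start there.
    kappa = lam + eigs.sum() / n_samples
    for _ in range(n_iter):
        df1 = np.sum(eigs / (eigs + kappa))
        new_kappa = lam + kappa * df1 / n_samples
        if abs(new_kappa - kappa) <= tol * new_kappa:
            return new_kappa
        kappa = new_kappa
    return kappa


def empirical_vs_equivalent(n_features=200, n_samples=400, lam=0.1, seed=0):
    """Compare lam * tr[(S_hat + lam)^-1] / N with kappa * tr[(S + kappa)^-1] / N.

    Uses an isotropic population covariance S = I purely as an illustration.
    """
    rng = np.random.default_rng(seed)
    eigs = np.ones(n_features)
    X = rng.standard_normal((n_samples, n_features))
    sigma_hat = X.T @ X / n_samples  # empirical covariance
    resolvent = np.linalg.inv(sigma_hat + lam * np.eye(n_features))
    empirical = lam * np.trace(resolvent) / n_features
    kappa = renormalized_ridge(eigs, lam, n_samples)
    equivalent = np.mean(kappa / (eigs + kappa))
    return empirical, equivalent, kappa


if __name__ == "__main__":
    emp, det, kappa = empirical_vs_equivalent()
    print(f"kappa = {kappa:.4f}, empirical = {emp:.4f}, equivalent = {det:.4f}")
```

Under these assumptions, the random quantity computed from the empirical covariance with ridge lam should sit close to the deterministic quantity computed from the population covariance with the larger ridge kappa, which is the sense in which fluctuations are absorbed into the ridge parameter.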
