Exact expressions for double descent and implicit regularization via surrogate random design
Michał Dereziński, Feynman Liang, Michael W. Mahoney

TL;DR
This paper derives exact non-asymptotic formulas for double descent in linear regression using a novel surrogate random design, revealing the implicit regularization effect of over-parameterized models.
Contribution
It introduces a surrogate random design that yields exact expressions for double descent, and it shows that the implicit regularization of the minimum norm estimator in over-parameterized linear models corresponds to ridge regression.
Findings
Exact formulas for mean squared error under surrogate design
Implicit bias of minimum norm estimator equals ridge regularization
A new mathematical tool: the class of random matrices for which the determinant commutes with expectation
Abstract
Double descent refers to the phase transition that is exhibited by the generalization error of unregularized learning models when varying the ratio between the number of parameters and the number of training samples. The recent success of highly over-parameterized machine learning models such as deep neural networks has motivated a theoretical analysis of the double descent phenomenon in classical models such as linear regression which can also generalize well in the over-parameterized regime. We provide the first exact non-asymptotic expressions for double descent of the minimum norm linear estimator. Our approach involves constructing a special determinantal point process which we call surrogate random design, to replace the standard i.i.d. design of the training sample. This surrogate design admits exact expressions for the mean squared error of the estimator while preserving the key…
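The double descent behavior described in the abstract can be reproduced numerically. The sketch below is an illustration only: it uses a standard i.i.d. Gaussian design rather than the paper's surrogate design, and the function name `min_norm_mse`, the choice of ground-truth vector, the noise level, and the use of parameter-space squared error are all assumptions made for this example. It estimates the error of the minimum norm estimator by Monte Carlo for several sample sizes n at a fixed dimension d.

```python
import numpy as np

def min_norm_mse(n, d, trials=200, noise_std=1.0, seed=0):
    """Monte Carlo estimate of E||w_hat - w_star||^2 for the minimum norm
    least squares estimator under an i.i.d. Gaussian design (illustrative
    assumption; this is not the surrogate design from the paper)."""
    rng = np.random.default_rng(seed)
    w_star = np.ones(d) / np.sqrt(d)          # hypothetical unit-norm ground truth
    errors = []
    for _ in range(trials):
        X = rng.standard_normal((n, d))        # n samples, d parameters
        y = X @ w_star + noise_std * rng.standard_normal(n)
        # The pseudoinverse gives the least squares solution when n >= d
        # and the minimum norm interpolating solution when n < d.
        w_hat = np.linalg.pinv(X) @ y
        errors.append(np.sum((w_hat - w_star) ** 2))
    return float(np.mean(errors))

d = 20
curve = {n: min_norm_mse(n, d) for n in (5, 10, 20, 40, 80)}
for n, err in curve.items():
    print(f"n={n:3d}  d={d}  estimated MSE={err:.3f}")
```

In this setup the estimated error should peak sharply near n = d and fall off on both sides, which is the phase transition the abstract refers to. The Monte Carlo average near n = d is heavy-tailed, so the exact peak height varies with the seed and the number of trials.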
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Random Matrices and Applications · Markov Chains and Monte Carlo Methods
Methods: Linear Regression
