Exact expressions for double descent and implicit regularization via surrogate random design

Michał Dereziński; Feynman Liang; Michael W. Mahoney

arXiv:1912.04533 · cs.LG · June 19, 2020 · 33 cites



TL;DR

This paper derives exact non-asymptotic formulas for double descent in linear regression using a novel surrogate random design, revealing the implicit regularization effect of over-parameterized models.

Contribution

It introduces a surrogate random design that yields exact expressions for double descent, and it shows that the implicit regularization of over-parameterized linear models takes the form of ridge regression.

Findings

01. Exact formulas for mean squared error under surrogate design

02. Implicit bias of minimum norm estimator equals ridge regularization

03. New mathematical tools for random matrices with commuting determinants

Abstract

Double descent refers to the phase transition that is exhibited by the generalization error of unregularized learning models when varying the ratio between the number of parameters and the number of training samples. The recent success of highly over-parameterized machine learning models such as deep neural networks has motivated a theoretical analysis of the double descent phenomenon in classical models such as linear regression which can also generalize well in the over-parameterized regime. We provide the first exact non-asymptotic expressions for double descent of the minimum norm linear estimator. Our approach involves constructing a special determinantal point process which we call surrogate random design, to replace the standard i.i.d. design of the training sample. This surrogate design admits exact expressions for the mean squared error of the estimator while preserving the key…
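The double descent shape described in the abstract can be observed numerically. The following is a minimal sketch, assuming a standard i.i.d. Gaussian design rather than the paper's surrogate random design, so it produces Monte Carlo estimates and not the exact expressions the paper derives. The function name `min_norm_mse`, the ground-truth vector `w_star`, the noise level, and the chosen sample sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def min_norm_mse(n, d, n_trials=200, noise_std=1.0, seed=0):
    """Monte Carlo estimate of E||w_hat - w_star||^2 for the minimum norm
    least squares estimator w_hat = pinv(X) @ y.

    Assumption: rows of X are i.i.d. standard Gaussian (NOT the surrogate design).
    """
    rng = np.random.default_rng(seed)
    w_star = np.ones(d) / np.sqrt(d)  # hypothetical unit-norm ground truth
    errors = []
    for _ in range(n_trials):
        X = rng.standard_normal((n, d))
        y = X @ w_star + noise_std * rng.standard_normal(n)
        # The pseudoinverse gives the least squares solution when n >= d
        # and the minimum norm interpolating solution when n < d.
        w_hat = np.linalg.pinv(X) @ y
        errors.append(np.sum((w_hat - w_star) ** 2))
    return float(np.mean(errors))

d = 20                                   # number of parameters
sample_sizes = [5, 10, 18, 20, 22, 40, 80]
curve = {n: min_norm_mse(n, d) for n in sample_sizes}

for n, mse in curve.items():
    print(f"n={n:3d}  d/n={d / n:5.2f}  MSE={mse:10.3f}")
```

Under these assumptions the estimated error should peak near n = d and fall off on both sides, which is the phase transition in the parameters-to-samples ratio that the abstract refers to. The estimate near n = d is noisy because the error there is heavy-tailed.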


Videos

Exact expressions for double descent and implicit regularization via surrogate random design · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Random Matrices and Applications · Markov Chains and Monte Carlo Methods

Methods: Linear Regression