Asymptotic Risk of Overparameterized Likelihood Models: Double Descent Theory for Deep Neural Networks

Ryumei Nakada; Masaaki Imaizumi

arXiv:2103.00500 · stat.ML · March 16, 2021 · 1 citation

TL;DR

This paper develops a theoretical framework for analyzing the asymptotic risk of overparameterized likelihood models, including deep neural networks. It extends existing analyses, which cover only linear-in-feature models, to nonlinear multi-layer architectures and explains phenomena such as double descent.

Contribution

The paper derives a general upper bound on the asymptotic risk of penalized estimators in overparameterized likelihood models, including deep neural networks, without linear-in-feature constraints. The analysis uses spectral analysis and empirical process techniques.

Findings

01 Large deep models can have small asymptotic risk if they have specific structures.

02 The theory explains double descent and regularized risk curves.

03 Empirical validation with parallel deep neural networks supports the theory.

Abstract

We investigate the asymptotic risk of a general class of overparameterized likelihood models, including deep models. The recent empirical success of large-scale models has motivated several theoretical studies to investigate a scenario wherein both the number of samples, $n$, and parameters, $p$, diverge to infinity and derive an asymptotic risk at the limit. However, these theorems are only valid for linear-in-feature models, such as generalized linear regression, kernel regression, and shallow neural networks. Hence, it is difficult to investigate a wider class of nonlinear models, including deep neural networks with three or more layers. In this study, we consider a likelihood maximization problem without the model constraints and analyze the upper bound of an asymptotic risk of an estimator with penalization. Technically, we combine a property of the Fisher information matrix with…
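To make the double descent shape concrete, the following is a minimal sketch and not the paper's method. It assumes a toy linear-in-feature setting, which is the class the abstract says earlier theory already covers: isotropic Gaussian features, a unit-norm signal spread over `d` coordinates, and a minimum-norm least-squares fit that uses only the first `p` features. The function name `min_norm_risk` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def min_norm_risk(n, p, d, sigma, n_trials, rng):
    """Average excess test risk of min-norm least squares on the first p of d features.

    Toy assumption: features are i.i.d. standard normal, so the excess risk of a
    coefficient vector equals its squared distance to the true coefficients.
    """
    beta = np.ones(d) / np.sqrt(d)  # true signal, unit norm, spread over all d features
    risks = []
    for _ in range(n_trials):
        X = rng.standard_normal((n, d))
        y = X @ beta + sigma * rng.standard_normal(n)
        # pinv gives ordinary least squares when p < n and the
        # minimum-norm interpolating solution when p > n.
        beta_hat = np.linalg.pinv(X[:, :p]) @ y
        # Error on the p fitted coordinates plus signal energy in the unused ones.
        risk = np.sum((beta_hat - beta[:p]) ** 2) + np.sum(beta[p:] ** 2)
        risks.append(risk)
    return float(np.mean(risks))


rng = np.random.default_rng(0)
n, d, sigma = 40, 200, 0.5  # hypothetical sample size, feature count, noise level
ps = [10, 20, 30, 40, 60, 100, 200]
risks = {p: min_norm_risk(n, p, d, sigma, n_trials=50, rng=rng) for p in ps}

for p in ps:
    print(f"p/n = {p / n:4.2f}  excess risk = {risks[p]:10.3f}")
```

In this toy model the printed risk rises sharply as `p` approaches `n` (the interpolation threshold) and falls again as `p` grows past it. The paper's contribution is an upper bound for nonlinear models, where this closed-form pseudoinverse estimator is not available.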

Taxonomy

Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Sparse and Compressive Sensing Techniques