Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime

Stéphane d'Ascoli; Maria Refinetti; Giulio Biroli; Florent Krzakala

arXiv:2003.01054 · cs.LG · April 6, 2020 · 53 citations

TL;DR

This paper provides a detailed theoretical analysis of the double descent phenomenon in neural networks within the lazy learning regime, highlighting the roles of bias, variance, and ensemble methods in overparametrization.

Contribution

It introduces a precise asymptotic bias-variance decomposition for high-dimensional random features regression, revealing phase transitions and the impact of ensemble averaging.
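
For orientation, a decomposition of this kind can be written by applying the law of total variance twice. The block below is a sketch under assumed notation, not the paper's exact definitions: $f$ is the target function, $\hat f$ the learned predictor, $X$ the training inputs, $\Theta$ the random initial weights, $\varepsilon$ the label noise, and $x$ a fresh test input; the test error is measured against the noiseless target. The split of the variance depends on the order in which the sources of randomness are conditioned on, so other orderings give differently named terms.

```latex
\begin{aligned}
\mathbb{E}_{x}\,\mathbb{E}_{X,\Theta,\varepsilon}\big[(f(x)-\hat f(x))^2\big]
&= \underbrace{\mathbb{E}_{x}\big[\big(f(x)-\mathbb{E}_{X,\Theta,\varepsilon}\,\hat f(x)\big)^2\big]}_{\text{bias}} \\
&\quad+ \underbrace{\mathbb{E}_{x}\,\mathbb{E}_{X,\Theta}\,\mathrm{Var}_{\varepsilon}\big[\hat f(x)\mid X,\Theta\big]}_{\text{noise variance}} \\
&\quad+ \underbrace{\mathbb{E}_{x}\,\mathbb{E}_{X}\,\mathrm{Var}_{\Theta}\big[\mathbb{E}_{\varepsilon}\,\hat f(x)\mid X\big]}_{\text{initialization variance}} \\
&\quad+ \underbrace{\mathbb{E}_{x}\,\mathrm{Var}_{X}\big[\mathbb{E}_{\Theta,\varepsilon}\,\hat f(x)\big]}_{\text{sampling variance}}
\end{aligned}
```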

Findings

01. The bias exhibits a phase transition at the interpolation threshold, beyond which it remains constant.

02. Ensemble averaging suppresses variance contributions, stabilizing the test error.

03. The results extend qualitatively to realistic deep learning scenarios.

Abstract

Deep neural networks can achieve remarkable generalization performance while interpolating the training data perfectly. Rather than the U-curve emblematic of the bias-variance trade-off, their test error often follows a "double descent", a mark of the beneficial role of overparametrization. In this work, we develop a quantitative theory for this phenomenon in the so-called lazy learning regime of neural networks, by considering the problem of learning a high-dimensional function with random features regression. We obtain a precise asymptotic expression for the bias-variance decomposition of the test error, and show that the bias displays a phase transition at the interpolation threshold, beyond which it remains constant. We disentangle the variances stemming from the sampling of the dataset, from the additive noise corrupting the labels, and from the initialization of the weights.…
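
The setting described in the abstract can be explored numerically at small scale. The following is a minimal sketch, not the authors' code: it assumes a linear teacher, ReLU random features, ridge regression with a small regularizer, and illustrative sizes (`N`, `d`, `K`, `noise`, `lam` are all hypothetical choices). It trains `K` random features models that share one dataset but draw independent first-layer weights `Theta`, then compares the average single-model test error with the test error of the averaged (ensembled) predictor.

```python
import numpy as np


def random_features(X, Theta):
    """ReLU random features: relu(X @ Theta.T / sqrt(d))."""
    d = X.shape[1]
    return np.maximum(X @ Theta.T / np.sqrt(d), 0.0)


def fit_ridge(Z, y, lam):
    """Ridge regression on the features Z (second-layer weights only)."""
    P = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(P), Z.T @ y)


def rf_errors(P, N=50, d=20, K=5, noise=0.5, lam=1e-3, n_test=500, seed=0):
    """Return (mean single-model test MSE, ensemble test MSE) for P features.

    All K models see the same noisy training set; only Theta differs.
    The linear teacher and every default value are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal(d)                      # teacher vector
    X = rng.standard_normal((N, d))                    # training inputs
    y = X @ beta / np.sqrt(d) + noise * rng.standard_normal(N)
    X_test = rng.standard_normal((n_test, d))
    y_test = X_test @ beta / np.sqrt(d)                # noiseless test labels

    preds = []
    for _ in range(K):
        Theta = rng.standard_normal((P, d))            # fresh random first layer
        a = fit_ridge(random_features(X, Theta), y, lam)
        preds.append(random_features(X_test, Theta) @ a)
    preds = np.array(preds)                            # shape (K, n_test)

    single = np.mean((preds - y_test) ** 2)            # averaged over the K models
    ensemble = np.mean((preds.mean(axis=0) - y_test) ** 2)
    return single, ensemble


if __name__ == "__main__":
    for P in (10, 25, 50, 100, 400):                   # P = N = 50 is the threshold
        s, e = rf_errors(P)
        print(f"P/N = {P / 50:4.1f}  single = {s:8.3f}  ensemble = {e:8.3f}")
```

By convexity of the squared error, the ensemble error can never exceed the mean single-model error; how large the gap is near `P = N` depends on the assumed noise level and regularizer, so the numbers should be read as qualitative only.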

Code & Models

Videos

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Machine Learning and Data Classification