Triple descent and the two kinds of overfitting: Where & why do they appear?

Stéphane d'Ascoli; Levent Sagun; Giulio Biroli

arXiv:2006.03509 · cs.LG · January 12, 2022

TL;DR

This paper distinguishes two overfitting peaks in the generalization error of neural networks, a linear peak at $N = D$ and a nonlinear peak at $N \sim P$. It explains where each peak comes from and how nonlinearity and noise affect its appearance, using theoretical analysis and neural network experiments.

Contribution

It clarifies the difference between the linear and nonlinear overfitting peaks, introduces the concept of triple descent, and analyzes the causes of both peaks in noisy regression tasks.

Findings

01

Both peaks can coexist when neural networks are applied to noisy regression tasks.

02

The degree of nonlinearity of the activation function governs the relative size of the peaks.

03

Regularization suppresses the nonlinear peak.

Abstract

A recent line of research has highlighted the existence of a "double descent" phenomenon in deep learning, whereby increasing the number of training examples $N$ causes the generalization error of neural networks to peak when $N$ is of the same order as the number of parameters $P$. In earlier works, a similar phenomenon was shown to exist in simpler models such as linear regression, where the peak instead occurs when $N$ is equal to the input dimension $D$. Since both peaks coincide with the interpolation threshold, they are often conflated in the literature. In this paper, we show that despite their apparent similarity, these two scenarios are inherently different. In fact, both peaks can co-exist when neural networks are applied to noisy regression tasks. The relative size of the peaks is then governed by the degree of nonlinearity of the activation function. Building on recent…
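
The linear-regression peak at $N = D$ mentioned in the abstract can be reproduced numerically. The following is a minimal sketch, not the authors' code: it fits minimum-norm least squares to noisy labels generated by a random linear teacher and reports the excess test risk as $N$ varies around $D$. The function names (`excess_risk`, `risk_curve`), the Gaussian data model, and all parameter values are assumptions chosen for illustration. The sketch does not cover the nonlinear peak at $N \sim P$.

```python
import numpy as np


def excess_risk(n_samples, dim, noise_std, rng):
    """Excess test risk of min-norm least squares on one random noisy dataset.

    Assumed setup (illustrative only): Gaussian inputs, a unit-norm linear
    teacher, and additive Gaussian label noise.
    """
    beta = rng.standard_normal(dim)
    beta /= np.linalg.norm(beta)  # unit-norm teacher vector
    X = rng.standard_normal((n_samples, dim))
    y = X @ beta + noise_std * rng.standard_normal(n_samples)
    beta_hat = np.linalg.pinv(X) @ y  # minimum-norm least-squares solution
    # For isotropic Gaussian test inputs the excess risk is ||beta_hat - beta||^2.
    return float(np.sum((beta_hat - beta) ** 2))


def risk_curve(sample_sizes, dim=50, noise_std=0.5, trials=20, seed=0):
    """Median excess risk over independent trials for each sample size N."""
    rng = np.random.default_rng(seed)
    return {
        n: float(np.median([excess_risk(n, dim, noise_std, rng)
                            for _ in range(trials)]))
        for n in sample_sizes
    }


if __name__ == "__main__":
    # D = 50 here, so the interpolation threshold sits at N = 50.
    for n, risk in risk_curve([10, 25, 50, 100, 200]).items():
        print(f"N = {n:4d}  median excess risk = {risk:.3f}")
```

With these settings the risk should be markedly larger at $N = D$ than on either side, though the exact values depend on the seed.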

Code & Models

Repositories

sdascoli/triple-descent-paper
PyTorch · Official

Videos

Triple descent and the two kinds of overfitting: where & why do they appear? · SlidesLive