Deep Double Descent via Smooth Interpolation
Matteo Gamba, Erik Englesson, Mårten Björkman, Hossein Azizpour

TL;DR
This paper investigates how overparameterized deep networks interpolate noisy data and how the sharpness of their loss landscape in input space relates to generalization. It reveals a double descent phenomenon in input space and challenges the common intuition that such networks interpolate noisy data sharply.
Contribution
It analyzes the sharpness of the loss landscape with respect to the input around training points in deep networks, linking it to double descent behavior as model size and the number of training epochs vary.
Findings
Loss sharpness exhibits double descent with model size and epochs.
Large models produce smoother loss landscapes around training data.
Noisy labels cause peaks in sharpness, especially in smaller models.
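The sharpness in these findings is measured in input space, locally around individual training points. As a rough illustration only, the sketch below estimates how much a loss changes under small random perturbations of one training input. The Monte Carlo estimator, the function names (`input_sharpness`, `make_squared_loss`), and the toy linear models are all assumptions made for this example and are not the paper's metric or code.

```python
import math
import random


def input_sharpness(loss_fn, x, radius=0.1, n_samples=256, seed=0):
    """Monte Carlo estimate of how much the loss varies around input x.

    Returns the mean |loss(x + delta) - loss(x)| over perturbations delta
    with a random direction and Euclidean norm equal to `radius`.
    This is a hypothetical proxy, not the measure used in the paper.
    """
    rng = random.Random(seed)
    base = loss_fn(x)
    total = 0.0
    for _ in range(n_samples):
        # Random direction: normalise a Gaussian vector, then scale to `radius`.
        g = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(v * v for v in g)) or 1.0
        x_pert = [xi + radius * gi / norm for xi, gi in zip(x, g)]
        total += abs(loss_fn(x_pert) - base)
    return total / n_samples


def make_squared_loss(weights, target):
    """Squared error of a toy linear model f(x) = <weights, x> (assumption)."""
    def loss(x):
        pred = sum(w * xi for w, xi in zip(weights, x))
        return (pred - target) ** 2
    return loss


# Two toy models that both fit the same training point exactly,
# but with different steepness around it.
x_train = [1.0, -0.5]
smooth_fit = make_squared_loss([0.5, 0.2], target=0.4)
sharp_fit = make_squared_loss([5.0, 2.0], target=4.0)

smooth_score = input_sharpness(smooth_fit, x_train)
sharp_score = input_sharpness(sharp_fit, x_train)
```

Both toy models have zero loss at `x_train`, yet `sharp_score` is larger than `smooth_score` because the second model's loss rises faster as the input moves away from the training point. Comparing such a score across model sizes or training epochs is the kind of experiment the findings summarize.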
Abstract
The ability of overparameterized deep networks to interpolate noisy data, while at the same time showing good generalization performance, has been recently characterized in terms of the double descent curve for the test error. Common intuition from polynomial regression suggests that overparameterized networks are able to sharply interpolate noisy data, without considerably deviating from the ground-truth signal, thus preserving generalization ability. At present, a precise characterization of the relationship between interpolation and generalization for deep networks is missing. In this work, we quantify sharpness of fit of the training data interpolated by neural network functions, by studying the loss landscape w.r.t. the input variable locally to each training point, over volumes around cleanly- and noisily-labelled training samples, as we systematically increase the number of…
Taxonomy
Topics: Machine Learning and Data Classification · Model Reduction and Neural Networks · Advanced Neural Network Applications
Methods: Test · Linear Regression
