Understanding the double descent curve in Machine Learning
Luis Sa-Couto, Jose Miguel Ramos, Miguel Almeida, Andreas Wichert

TL;DR
This paper investigates the double descent phenomenon in machine learning, providing a theoretical framework, exploring its implications for model selection, and validating predictions with experimental results.
Contribution
It offers a fundamental theoretical understanding of double descent and its impact on model selection, which the field previously lacked.
Findings
Double descent occurs in over-parameterized models.
Theoretical predictions align with experimental results.
The paper offers insights into when double descent is expected to occur.
Abstract
Bias-variance theory used to serve as a guide for model selection when applying Machine Learning algorithms. However, modern practice has shown success with over-parameterized models that were expected to overfit but did not. This led Belkin et al. to propose the double descent curve of performance. Although the curve seems to describe a real, representative phenomenon, the field lacks a fundamental theoretical understanding of what is happening, what the consequences are for model selection, and when double descent is expected to occur. In this paper we develop a principled understanding of the phenomenon and sketch answers to these important questions. Furthermore, we report real experimental results that are correctly predicted by our proposed hypothesis.
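To make the phenomenon described in the abstract concrete, the following is a minimal sketch of how a double descent curve is commonly probed in toy settings. It is not the authors' experiment: the random Fourier feature model, the synthetic linear target, the minimum-norm least-squares fit, and every name and parameter (`double_descent_curve`, `widths`, `n_train`, `noise`, and so on) are assumptions chosen for illustration. The idea is to sweep the number of features p past the number of training samples n and record train and test error at each width.

```python
import numpy as np


def random_fourier_features(X, W, b):
    """Map inputs to p random cosine features (illustrative choice of model)."""
    return np.cos(X @ W + b)


def double_descent_curve(widths, n_train=40, n_test=500, d=5, noise=0.1, seed=0):
    """Fit minimum-norm least squares on random features of increasing width.

    All settings here are hypothetical defaults for a toy demonstration.
    Returns (train_mse, test_mse), one entry per width.
    """
    rng = np.random.default_rng(seed)

    # Synthetic regression task: noisy linear target (an assumption).
    beta = rng.normal(size=d)
    X_tr = rng.normal(size=(n_train, d))
    X_te = rng.normal(size=(n_test, d))
    y_tr = X_tr @ beta + noise * rng.normal(size=n_train)
    y_te = X_te @ beta

    train_mse, test_mse = [], []
    for p in widths:
        # Fresh random feature map with p features.
        W = rng.normal(size=(d, p))
        b = rng.uniform(0.0, 2.0 * np.pi, size=p)
        Phi_tr = random_fourier_features(X_tr, W, b)
        Phi_te = random_fourier_features(X_te, W, b)

        # The pseudoinverse gives the least-squares solution when p < n
        # and the minimum-norm interpolating solution when p >= n.
        coef = np.linalg.pinv(Phi_tr) @ y_tr

        train_mse.append(np.mean((Phi_tr @ coef - y_tr) ** 2))
        test_mse.append(np.mean((Phi_te @ coef - y_te) ** 2))

    return np.array(train_mse), np.array(test_mse)


# Sweep widths on both sides of the interpolation threshold p = n_train = 40.
widths = [5, 10, 20, 40, 80, 160, 320]
train_mse, test_mse = double_descent_curve(widths)
for p, tr, te in zip(widths, train_mse, test_mse):
    print(f"p={p:4d}  train MSE={tr:.3e}  test MSE={te:.3e}")
```

In sweeps of this kind, training error typically falls to (numerically) zero once p reaches n, and test error is often worst near p = n before decreasing again as p grows. The exact shape depends on the seed, noise level, and feature map, so the printed numbers should be read as an illustration and not as a reproduction of the paper's results.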
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Neural Networks and Applications
