Understanding the double descent curve in Machine Learning

Luis Sa-Couto; Jose Miguel Ramos; Miguel Almeida; Andreas Wichert

arXiv:2211.10322 · cs.LG · November 21, 2022

Open Access

TL;DR

This paper investigates the double descent phenomenon in machine learning, providing a theoretical framework, exploring its implications for model selection, and validating predictions with experimental results.

Contribution

It offers a fundamental theoretical understanding of double descent and its impact on model selection, which was previously lacking.

Findings

01 Double descent occurs in over-parameterized models.

02 Theoretical predictions align with experimental results.

03 Insights into when double descent is expected to happen.

Abstract

The theory of bias-variance used to serve as a guide for model selection when applying Machine Learning algorithms. However, modern practice has shown success with over-parameterized models that were expected to overfit but did not. This led to the proposal of the double descent curve of performance by Belkin et al. Although it seems to describe a real, representative phenomenon, the field lacks a fundamental theoretical understanding of what is happening, what the consequences are for model selection, and when double descent is expected to occur. In this paper, we develop a principled understanding of the phenomenon and sketch answers to these important questions. Furthermore, we report real experimental results that are correctly predicted by our proposed hypothesis.

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Neural Networks and Applications