# Two models of double descent for weak features

**Authors:** Mikhail Belkin, Daniel Hsu, Ji Xu

arXiv: 1903.07571 · 2020-12-22

## TL;DR

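This paper gives a precise mathematical analysis of the double descent risk curve for the least squares/least norm predictor in two simple data models. The risk peaks when the number of features $p$ is close to the sample size $n$, then decreases towards its minimum as $p$ increases beyond $n$. This behavior is contrasted with that of "prescient" models that select features in an a priori optimal order.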

## Contribution

It analyzes two simple data models in which the least squares/least norm predictor exhibits double descent, giving a precise mathematical description of the shape of the risk curve.

## Key findings

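- Risk peaks when the number of features $p$ is close to the sample size $n$
- Risk decreases towards its minimum as $p$ increases beyond $n$
- This behavior contrasts with that of "prescient" models that select features in an a priori optimal order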

## Abstract

The "double descent" risk curve was proposed to qualitatively describe the out-of-sample prediction accuracy of variably-parameterized machine learning models. This article provides a precise mathematical analysis for the shape of this curve in two simple data models with the least squares/least norm predictor. Specifically, it is shown that the risk peaks when the number of features $p$ is close to the sample size $n$, but also that the risk decreases towards its minimum as $p$ increases beyond $n$. This behavior is contrasted with that of "prescient" models that select features in an a priori optimal order.
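
The shape described in the abstract can be reproduced numerically with a minimal sketch. The setup below is an assumption chosen for illustration, not the paper's exact experiment: isotropic Gaussian features, a unit-norm coefficient vector `beta` spread evenly over `D` features, noise level `sigma`, and a fit that uses only the first `p` features. The function name `least_norm_risk` and all parameter values are hypothetical. `np.linalg.pinv` returns the least squares solution when $p \le n$ and the minimum-norm interpolating solution when $p > n$.

```python
import numpy as np


def least_norm_risk(n, D, p, sigma=0.5, trials=50, seed=0):
    """Monte Carlo estimate of the prediction risk of the least squares /
    least norm fit that uses only the first p of D features.

    Assumed setup (illustrative): x ~ N(0, I_D), y = x @ beta + sigma * noise,
    with a unit-norm beta spread evenly over all D features.
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(D) / np.sqrt(D)
    risks = []
    for _ in range(trials):
        X = rng.standard_normal((n, D))
        y = X @ beta + sigma * rng.standard_normal(n)
        beta_hat = np.zeros(D)
        # pinv gives the least squares solution when p <= n and the
        # minimum-norm interpolating solution when p > n.
        beta_hat[:p] = np.linalg.pinv(X[:, :p]) @ y
        # For isotropic Gaussian x, the risk of a fixed beta_hat is
        # sigma^2 + ||beta - beta_hat||^2, so no test set is needed.
        risks.append(sigma**2 + np.sum((beta - beta_hat) ** 2))
    return float(np.mean(risks))


if __name__ == "__main__":
    n, D = 40, 200
    for p in (10, 20, 35, 40, 45, 80, 200):
        print(p, least_norm_risk(n, D, p))
```

Under these assumptions the estimated risk should be largest around $p = n$ and fall again as $p$ approaches `D`. Estimates near $p = n$ are very noisy because the fit is badly conditioned there, so averaging over more trials does little to stabilize them.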

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.07571/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1903.07571/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1903.07571/full.md

---
Source: https://tomesphere.com/paper/1903.07571