Rethinking Bias-Variance Trade-off for Generalization of Neural Networks

Zitong Yang; Yaodong Yu; Chong You; Jacob Steinhardt; Yi Ma

arXiv:2002.11328 · cs.LG · December 9, 2020 · 81 cites

TL;DR

This paper revisits the classical bias-variance trade-off for neural networks. It shows that variance is unimodal in network width and that the shape of the risk curve depends on the relative scale of bias and variance, which helps explain why larger models often generalize better.

Contribution

The paper gives empirical and theoretical insight into the bias and variance of neural networks, showing that variance is unimodal and how this determines the shape of the risk curve.

Findings

01

Variance is unimodal, increasing then decreasing with network width.

02

The shape of the risk curve depends on the relative scale of bias and variance; double descent is one of the possible shapes.

03

Deeper models reduce bias but increase variance across data types.
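The second finding can be illustrated with a toy model. The sketch below is not taken from the paper: the bias curve (a decaying exponential), the variance curve (a Gaussian bump), the width axis, and the helper names `risk_curve` and `shape_of` are all assumptions chosen only to show how adding a monotonically decreasing bias to a unimodal variance can yield different qualitative risk shapes as the variance scale changes.

```python
import numpy as np

# Hypothetical "width" axis; the units are arbitrary.
x = np.linspace(0.0, 10.0, 1001)

def risk_curve(variance_scale):
    """Toy risk = bias + variance, using synthetic curves (not measured data)."""
    bias = np.exp(-x / 2.0)                                   # monotonically decreasing
    variance = variance_scale * np.exp(-(x - 5.0) ** 2 / 18)  # unimodal (bell-shaped)
    return bias + variance

def shape_of(curve):
    """Name the qualitative shape from the sign pattern of the curve's slope."""
    signs = np.sign(np.diff(curve))
    signs = signs[signs != 0]
    # Collapse runs of equal signs, e.g. [-, -, +, +, -] becomes "-+-".
    runs = [signs[0]] + [b for a, b in zip(signs[:-1], signs[1:]) if a != b]
    pattern = "".join("-" if s < 0 else "+" for s in runs)
    names = {
        "-": "monotone decreasing",
        "-+": "U-shaped",
        "+-": "bell-shaped",
        "-+-": "double descent",
    }
    return names.get(pattern, "other")

for scale in (0.05, 1.0, 10.0):
    print(f"variance scale {scale}: {shape_of(risk_curve(scale))}")
```

With these particular synthetic curves, a small variance scale leaves the risk monotonically decreasing, a comparable scale produces a descent-ascent-descent curve, and a large scale makes the risk follow the variance bump. The thresholds are properties of the toy curves, not results from the paper.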

Abstract

The classical bias-variance trade-off predicts that bias decreases and variance increases with model complexity, leading to a U-shaped risk curve. Recent work calls this into question for neural networks and other over-parameterized models, for which it is often observed that larger models generalize better. We provide a simple explanation for this by measuring the bias and variance of neural networks: while the bias is monotonically decreasing as in the classical theory, the variance is unimodal or bell-shaped: it increases then decreases with the width of the network. We vary the network architecture, loss function, and choice of dataset and confirm that variance unimodality occurs robustly for all models we considered. The risk curve is the sum of the bias and variance curves and displays different qualitative shapes depending on the relative scale of bias and variance, with the…
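To make "measuring the bias and variance" concrete, here is a minimal sketch of a Monte Carlo estimate under squared loss. It is an illustration only and not the authors' estimator: it uses polynomial regression in place of a neural network, and the target function, noise level, sample sizes, and function names (`target`, `fit_predict`, `bias_variance`) are all assumptions. The model is refit on many independently drawn training sets; squared bias is the error of the average prediction, and variance is the spread of predictions across training sets.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return np.sin(2 * np.pi * x)

def fit_predict(x_train, y_train, x_test, degree):
    # Least-squares polynomial fit, standing in for "train a model".
    coeffs = np.polynomial.polynomial.polyfit(x_train, y_train, degree)
    return np.polynomial.polynomial.polyval(x_test, coeffs)

def bias_variance(degree, n_trials=200, n_train=30, noise=0.3):
    """Estimate squared bias, variance, and risk over random training sets."""
    x_test = np.linspace(0.0, 1.0, 50)
    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        x_train = rng.uniform(0.0, 1.0, n_train)
        y_train = target(x_train) + noise * rng.normal(size=n_train)
        preds[t] = fit_predict(x_train, y_train, x_test, degree)
    mean_pred = preds.mean(axis=0)
    bias_sq = np.mean((mean_pred - target(x_test)) ** 2)
    variance = np.mean(preds.var(axis=0))
    # Risk against the noiseless target, so irreducible noise is excluded.
    risk = np.mean((preds - target(x_test)) ** 2)
    return bias_sq, variance, risk

for degree in (1, 3, 5):
    b, v, r = bias_variance(degree)
    print(f"degree {degree}: bias^2={b:.4f} variance={v:.4f} risk={r:.4f}")
```

Because the risk here is measured against the noiseless target and the variance uses the population formula, the estimated risk equals squared bias plus variance up to floating-point error, which mirrors the statement that the risk curve is the sum of the bias and variance curves.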

Code & Models

Repositories

yaodongyu/Rethink-BiasVariance-Tradeoff
PyTorch · Official

Videos

Rethinking Bias-Variance Trade-off for Generalization of Neural Networks · slideslive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications