A Classical View on Benign Overfitting: The Role of Sample Size

Junhyung Park; Patrick Bloebaum; Shiva Prasad Kasiviswanathan

arXiv:2505.11621·cs.LG·May 20, 2025

TL;DR

This paper argues that almost benign overfitting, where models reach arbitrarily small (but non-zero) training error while still generalizing well, can arise even in classical statistical regimes, and supports the claim with theoretical analysis of kernel and neural network models.

Contribution

It introduces the notion of almost benign overfitting and shows, through theoretical analysis of kernel ridge regression and neural networks, that it can emerge in classical regimes.

Findings

01

Models can achieve arbitrarily small training and test errors simultaneously.

02

Theoretical evidence supports almost benign overfitting in classical regimes.

03

A novel proof technique decomposes excess risk into estimation and approximation errors.
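
For orientation, the textbook form of such a decomposition is sketched below. The symbols are assumptions introduced here, not taken from the paper: $R$ is the population risk, $R^*$ the Bayes risk, $\mathcal{F}$ the model class, and $\hat f_n$ the estimator fitted on $n$ samples. The paper's own decomposition may differ in its details.

```latex
R(\hat f_n) - R^*
  = \underbrace{R(\hat f_n) - \inf_{f \in \mathcal{F}} R(f)}_{\text{estimation error}}
  + \underbrace{\inf_{f \in \mathcal{F}} R(f) - R^*}_{\text{approximation error}}
```

In this form, the estimation error typically shrinks as the sample size grows for a fixed class, while the approximation error shrinks as the class becomes richer.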

Abstract

Benign overfitting is a phenomenon in machine learning where a model perfectly fits (interpolates) the training data, including noisy examples, yet still generalizes well to unseen data. Understanding this phenomenon has attracted considerable attention in recent years. In this work, we introduce a conceptual shift by focusing on almost benign overfitting, where models simultaneously achieve both arbitrarily small training and test errors. This behavior is characteristic of neural networks, which often achieve low (but non-zero) training error while still generalizing well. We hypothesize that this almost benign overfitting can emerge even in classical regimes, by analyzing how the interaction between sample size and model complexity enables larger models to achieve a good training fit while still approaching Bayes-optimal generalization. We substantiate this hypothesis with theoretical…
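
The interplay between sample size and model complexity described in the abstract can be explored numerically. The following is a minimal sketch, not the paper's experimental or proof setup: the Gaussian kernel, the sine target, the noise level, the bandwidth, and the helper names `rbf_kernel` and `krr_errors` are all assumptions chosen for illustration. It fits kernel ridge regression on noisy data and reports the training error alongside the excess risk over the Bayes-optimal predictor.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between two 1-D point sets."""
    sq_dists = (a[:, None] - b[None, :]) ** 2
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def krr_errors(n_train, ridge, bandwidth=0.1, noise_std=0.5, n_test=2000, seed=0):
    """Fit kernel ridge regression; return (training MSE, excess risk).

    Under squared loss the excess risk over the Bayes predictor equals the
    mean squared distance between the fitted function and the true
    regression function, estimated here on fresh test inputs.
    """
    rng = np.random.default_rng(seed)

    def target(x):
        # Hypothetical regression function (an assumption of this sketch).
        return np.sin(2.0 * np.pi * x)

    x_train = rng.uniform(0.0, 1.0, n_train)
    y_train = target(x_train) + noise_std * rng.normal(size=n_train)
    x_test = rng.uniform(0.0, 1.0, n_test)

    # Solve (K + n * ridge * I) alpha = y for the dual coefficients.
    k_train = rbf_kernel(x_train, x_train, bandwidth)
    alpha = np.linalg.solve(k_train + n_train * ridge * np.eye(n_train), y_train)

    train_pred = k_train @ alpha
    test_pred = rbf_kernel(x_test, x_train, bandwidth) @ alpha

    train_mse = float(np.mean((train_pred - y_train) ** 2))
    excess_risk = float(np.mean((test_pred - target(x_test)) ** 2))
    return train_mse, excess_risk


if __name__ == "__main__":
    # Sweep sample size and regularization strength; a smaller ridge
    # corresponds to a less constrained (more complex) fit.
    for n in (50, 200, 800):
        for ridge in (1e-1, 1e-3, 1e-6):
            train_mse, excess = krr_errors(n, ridge)
            print(f"n={n:4d} ridge={ridge:.0e} train_mse={train_mse:.4f} excess_risk={excess:.4f}")
```

Running the sweep prints both errors for each combination, so the effect of growing the sample size at a fixed regularization level can be read off directly. The values depend on the assumed target, noise, and bandwidth, so they should not be read as reproducing the paper's results.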

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning

Methods: Softmax · Attention Is All You Need