Overfitting has a limitation: a model-independent generalization gap bound based on Rényi entropy

Atsushi Suzuki; Jing Wang

arXiv:2506.00182 · stat.ML · May 18, 2026

TL;DR

This paper presents a model-independent upper bound on the generalization gap based on Rényi entropy. The bound suggests that large models can generalize well when the amount of data is sufficient relative to the entropy of the data-generating distribution.

Contribution

It introduces a Rényi entropy-based generalization gap bound that applies to algorithms whose output depends only on the histogram of the data, and uses it to explain when overfitting is limited and how data noise affects generalization.
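To make "depends only on the histogram of the data" concrete, here is a minimal sketch of empirical risk minimization over a finite hypothesis class, written so that it only ever sees per-example counts. The threshold classifiers, the toy data, and the function names are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def erm_from_histogram(histogram, hypotheses, loss):
    """Pick the hypothesis with the smallest count-weighted total loss.

    `histogram` maps each distinct example to how often it occurs,
    so the order of the original dataset cannot influence the result.
    """
    def total_loss(h):
        return sum(count * loss(h, z) for z, count in histogram.items())

    return min(hypotheses, key=total_loss)


def erm(dataset, hypotheses, loss):
    """ERM on a dataset, routed through its histogram."""
    return erm_from_histogram(Counter(dataset), hypotheses, loss)


def zero_one_loss(threshold, example):
    """Hypothetical toy loss: predict True iff x >= threshold."""
    x, y = example
    return int((x >= threshold) != y)


# Toy dataset of (x, label) pairs; one example appears twice.
data = [(0, False), (1, False), (2, True), (3, True), (2, True)]
thresholds = [0, 1, 2, 3, 4]

best = erm(data, thresholds, zero_one_loss)
best_reversed = erm(list(reversed(data)), thresholds, zero_one_loss)
```

Reordering the dataset leaves the histogram unchanged, so `best` and `best_reversed` are the same hypothesis. Any learning rule with this permutation invariance falls into the class the bound is stated for, according to the abstract.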

Findings

01

The bound depends solely on the data distribution's Rényi entropy.

02

Large models can generalize well if data size exceeds the entropy.

03

Adding noise increases Rényi entropy, degrading generalization.
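As a rough illustration of the third finding, the sketch below computes the standard Rényi entropy of order alpha for a discrete distribution, H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha), and compares a distribution before and after mixing it with uniform noise. The order alpha = 0.5, the noise model, and the example probabilities are assumptions chosen for illustration; this summary does not state which order or noise model the paper uses.

```python
import math


def renyi_entropy(p, alpha):
    """Rényi entropy (in nats) of order alpha for a discrete distribution p."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must sum to 1")
    support = [q for q in p if q > 0]
    if alpha == 1:
        # Limit alpha -> 1 recovers the Shannon entropy.
        return -sum(q * math.log(q) for q in support)
    return math.log(sum(q ** alpha for q in support)) / (1 - alpha)


def mix_with_uniform(p, eps):
    """Assumed noise model: with probability eps, replace the sample
    by one drawn uniformly from the same finite alphabet."""
    n = len(p)
    return [(1 - eps) * q + eps / n for q in p]


clean = [0.7, 0.2, 0.1]
noisy = mix_with_uniform(clean, eps=0.3)

h_clean = renyi_entropy(clean, alpha=0.5)
h_noisy = renyi_entropy(noisy, alpha=0.5)
```

Mixing with the uniform distribution flattens the probabilities, and Rényi entropy never decreases under such flattening, so `h_noisy` is larger than `h_clean`. Under the paper's bound, a larger entropy means more data is needed to keep the generalization gap small.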

Abstract

Will further scaling up of machine learning models continue to bring success? A significant challenge in answering this question lies in understanding the generalization gap, which measures the impact of overfitting. How the generalization gap behaves in increasingly large-scale machine learning models remains an open area of investigation, because conventional analyses tie error bounds to model complexity and therefore fail to fully explain the success of extremely large architectures. This research introduces a novel perspective by establishing a model-independent upper bound on the generalization gap that applies to algorithms whose outputs are determined solely by the data's histogram, such as empirical risk minimization or gradient-based methods. Crucially, this bound is shown to depend only on the Rényi entropy of the data-generating distribution, suggesting that a small generalization…

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification