An Information-Theoretic Perspective on Overfitting and Underfitting
Daniel Bashir, George D. Montanez, Sonia Sehra, Pedro Sandoval Segura, Julius Lauw

TL;DR
This paper introduces an information-theoretic framework for analyzing overfitting and underfitting in machine learning, proves that it is undecidable whether an arbitrary classification algorithm will overfit a dataset, and relates algorithm capacity to generalization.
Contribution
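It formalizes algorithm capacity as the information transferred from datasets to models, upper-bounds that capacity, and relates it to the algorithmic search framework for machine learning and to recent information-theoretic approaches to generalization.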
Findings
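Proves that determining whether an arbitrary classification algorithm will overfit a dataset is formally undecidable.
Provides upper bounds on algorithm capacity.
Links algorithm capacity to quantities in the algorithmic search framework and to information-theoretic generalization theory.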
Abstract
We present an information-theoretic framework for understanding overfitting and underfitting in machine learning and prove the formal undecidability of determining whether an arbitrary classification algorithm will overfit a dataset. Measuring algorithm capacity via the information transferred from datasets to models, we consider mismatches between algorithm capacities and datasets to provide a signature for when a model can overfit or underfit a dataset. We present results upper-bounding algorithm capacity, establish its relationship to quantities in the algorithmic search framework for machine learning, and relate our work to recent information-theoretic approaches to generalization.
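To make the idea of "information transferred from datasets to models" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formal definition: the helper `transferred_bits` and the three toy learners are hypothetical names, datasets are binary label vectors over a fixed set of points, and datasets are assumed to be drawn uniformly. For a deterministic learner, the mutual information between the dataset D and the output hypothesis H reduces to the entropy of H, which is what the sketch computes.

```python
import math
from collections import Counter
from itertools import product


def transferred_bits(learner, n_points):
    """Mutual information I(D; H) in bits between a dataset D and the
    hypothesis H = learner(D), assuming (for illustration only) that D is
    uniform over all binary labelings of n_points fixed inputs.

    Because the learner is deterministic, H(H | D) = 0, so I(D; H) = H(H).
    """
    datasets = list(product((0, 1), repeat=n_points))
    counts = Counter(learner(d) for d in datasets)
    total = len(datasets)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy learners; each maps a tuple of labels to a hashable "model".
def memorizer(labels):
    # Stores every label: the model is a lookup table of the data.
    return labels


def majority(labels):
    # Keeps only the majority label.
    return int(2 * sum(labels) > len(labels))


def constant(labels):
    # Ignores the data entirely.
    return 0


if __name__ == "__main__":
    for name, learner in [("memorizer", memorizer),
                          ("majority", majority),
                          ("constant", constant)]:
        print(name, transferred_bits(learner, 3))
```

With three points, the memorizer transfers 3 bits, the majority learner 1 bit, and the constant learner 0 bits. Read loosely, a learner that can absorb far more information than a dataset's regularities contain is a candidate for overfitting, while one that absorbs too little is a candidate for underfitting; the paper's own capacity measure and bounds should be taken from the paper itself.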
