Information-theoretic Characterizations of Generalization Error for the Gibbs Algorithm
Gholamali Aminian, Yuheng Bu, Laura Toni, Miguel R. D. Rodrigues, Gregory W. Wornell

TL;DR
This paper gives exact information-theoretic characterizations of the expected generalization error of the Gibbs algorithm in supervised learning, which can also be used to tighten existing generalization bounds.
Contribution
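It introduces exact characterizations (rather than upper bounds) of the expected generalization error in terms of symmetrized KL information, extends them to regularized and asymptotic variants of the Gibbs algorithm, and uses them to tighten previously loose bounds.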
Findings
Exact characterization of generalization error via symmetrized KL information
Tightens existing expected generalization error and PAC-Bayesian bounds, and applies to Gibbs algorithms with a data-dependent regularizer
Highlights the role of symmetrized KL information in generalization control
Abstract
Various approaches have been developed to upper bound the generalization error of a supervised learning algorithm. However, existing bounds are often loose and even vacuous when evaluated in practice. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contributions are exact characterizations of the expected generalization error of the well-known Gibbs algorithm (a.k.a. Gibbs posterior) using different information measures, in particular, the symmetrized KL information between the input training samples and the output hypothesis. Our result can be applied to tighten existing expected generalization error and PAC-Bayesian bounds. Our information-theoretic approach is versatile, as it also characterizes the generalization error of the Gibbs algorithm with a data-dependent regularizer and that of the Gibbs algorithm in the…
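As a rough illustration of what an exact characterization via symmetrized KL information means, the sketch below checks numerically, on a toy problem, that the expected generalization error of a Gibbs posterior equals the symmetrized KL information between the training set S and the hypothesis W divided by the inverse temperature. Everything specific here is an assumption made for illustration and does not come from the abstract: the symbol `gamma` for the inverse temperature, the binary data space, the three-point hypothesis set, the prior, the squared loss, and the function name `gibbs_gen_vs_skl`.

```python
import itertools
import math


def gibbs_gen_vs_skl(gamma=2.0, n=3, p=0.3):
    """Return (expected generalization error, I_SKL(W;S) / gamma) on a toy problem.

    All modelling choices below are illustrative assumptions.
    """
    data_vals = [0, 1]                      # assumed data space Z = {0, 1}
    p_z = {0: 1.0 - p, 1: p}                # assumed Bernoulli(p) data distribution
    hyps = [0.0, 0.5, 1.0]                  # assumed finite hypothesis space
    prior = [0.2, 0.5, 0.3]                 # assumed prior pi(w) with full support

    def loss(w, z):
        return (w - z) ** 2                 # assumed squared loss

    # Population risk L_P(w) = E_Z[loss(w, Z)].
    pop_risk = [sum(p_z[z] * loss(w, z) for z in data_vals) for w in hyps]

    # Enumerate every training set s of size n with its probability P_S(s).
    datasets = list(itertools.product(data_vals, repeat=n))
    p_s = [math.prod(p_z[z] for z in s) for s in datasets]

    # Empirical risk L_E(w, s) for every (s, w) pair.
    emp_risk = [[sum(loss(w, z) for z in s) / n for w in hyps] for s in datasets]

    # Gibbs posterior P(w | s) proportional to pi(w) * exp(-gamma * L_E(w, s)).
    post = []
    for risks in emp_risk:
        weights = [pi * math.exp(-gamma * r) for pi, r in zip(prior, risks)]
        total = sum(weights)
        post.append([x / total for x in weights])

    # Marginal distribution of the output hypothesis W.
    idx_s = range(len(datasets))
    idx_w = range(len(hyps))
    p_w = [sum(p_s[i] * post[i][j] for i in idx_s) for j in idx_w]

    # Expected generalization error E[L_P(W) - L_E(W, S)] under the joint law.
    gen = sum(p_s[i] * post[i][j] * (pop_risk[j] - emp_risk[i][j])
              for i in idx_s for j in idx_w)

    # Mutual information D(P_{W,S} || P_W x P_S).
    mutual = sum(p_s[i] * post[i][j] * math.log(post[i][j] / p_w[j])
                 for i in idx_s for j in idx_w)
    # Lautum information D(P_W x P_S || P_{W,S}).
    lautum = sum(p_s[i] * p_w[j] * math.log(p_w[j] / post[i][j])
                 for i in idx_s for j in idx_w)

    # Symmetrized KL information is the sum of the two divergences.
    return gen, (mutual + lautum) / gamma


gen, skl_over_gamma = gibbs_gen_vs_skl()
print(gen, skl_over_gamma)
```

On this toy problem the two printed values should agree up to floating-point rounding, and changing `gamma`, `n`, `p`, or the prior should preserve the agreement. Because the check enumerates all 2^n training sets, it is only practical for small `n`.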
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Algorithms · Machine Learning and ELM
