Information-theoretic Characterizations of Generalization Error for the Gibbs Algorithm

Gholamali Aminian; Yuheng Bu; Laura Toni; Miguel R. D. Rodrigues; Gregory W. Wornell

arXiv:2210.09864·cs.IT·October 19, 2022·1 cites

Open Access

TL;DR

This paper derives exact information-theoretic characterizations of the expected generalization error of the Gibbs algorithm, which can be used to tighten existing generalization bounds in supervised learning.

Contribution

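It gives exact characterizations of the expected generalization error in terms of the symmetrized KL information between the training samples and the output hypothesis, extends them to the Gibbs algorithm with a data-dependent regularizer and to the asymptotic setting, and uses them to tighten existing bounds that are often loose.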

Findings

01

Exact characterization of generalization error via symmetrized KL information

02

Tightens existing bounds and applies to regularized Gibbs algorithms

03

Highlights the role of symmetrized KL information in controlling the generalization error

Abstract

Various approaches have been developed to upper bound the generalization error of a supervised learning algorithm. However, existing bounds are often loose and even vacuous when evaluated in practice. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contributions are exact characterizations of the expected generalization error of the well-known Gibbs algorithm (a.k.a. Gibbs posterior) using different information measures, in particular, the symmetrized KL information between the input training samples and the output hypothesis. Our result can be applied to tighten existing expected generalization error and PAC-Bayesian bounds. Our information-theoretic approach is versatile, as it also characterizes the generalization error of the Gibbs algorithm with a data-dependent regularizer and that of the Gibbs algorithm in the…
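The exact characterization mentioned in the abstract can be checked numerically on a toy problem. The sketch below is not taken from this page: it assumes the standard Gibbs posterior form P(w|s) ∝ π(w) exp(-γ L_E(w, s)), with L_E the empirical risk and γ an inverse temperature, and it assumes the identity that the expected generalization error equals I_SKL(W; S) / γ, where I_SKL is the sum of the mutual information and the lautum information. The function names, the loss table, and the distributions are all hypothetical choices made for illustration.

```python
import itertools
import math

def gibbs_posterior(loss, prior, gamma, dataset):
    """Return (P(. | s), L_E(., s)) with P(w | s) ∝ prior[w] * exp(-gamma * L_E(w, s))."""
    n = len(dataset)
    emp_risk = [sum(loss[w][z] for z in dataset) / n for w in range(len(prior))]
    weights = [prior[w] * math.exp(-gamma * emp_risk[w]) for w in range(len(prior))]
    total = sum(weights)
    return [x / total for x in weights], emp_risk

def gen_error_and_skl(loss, mu, prior, gamma, n):
    """Return (expected generalization error, I_SKL(W;S) / gamma) by full enumeration."""
    num_w, num_z = len(prior), len(mu)
    # Population risk L_P(w) = E_{Z ~ mu}[loss(w, Z)].
    pop_risk = [sum(mu[z] * loss[w][z] for z in range(num_z)) for w in range(num_w)]
    rows = []  # one (P(s), P(. | s), L_E(., s)) triple per dataset s
    for s in itertools.product(range(num_z), repeat=n):
        p_s = math.prod(mu[z] for z in s)  # i.i.d. samples
        post, emp_risk = gibbs_posterior(loss, prior, gamma, s)
        rows.append((p_s, post, emp_risk))
    # Marginal distribution of the output hypothesis W.
    p_w = [sum(p_s * post[w] for p_s, post, _ in rows) for w in range(num_w)]
    gen = mutual = lautum = 0.0
    for p_s, post, emp_risk in rows:
        for w in range(num_w):
            joint, product = p_s * post[w], p_s * p_w[w]
            gen += joint * (pop_risk[w] - emp_risk[w])      # E[L_P(W) - L_E(W, S)]
            mutual += joint * math.log(joint / product)     # D(P_WS || P_W x P_S)
            lautum += product * math.log(product / joint)   # D(P_W x P_S || P_WS)
    return gen, (mutual + lautum) / gamma

# Hypothetical toy instance: 2 hypotheses, 3 data symbols, 3 training samples.
loss = [[0.0, 1.0, 0.4], [1.0, 0.0, 0.7]]  # loss[w][z]
mu = [0.2, 0.5, 0.3]                       # data distribution (all entries > 0)
prior = [0.6, 0.4]                         # prior over hypotheses (all entries > 0)
gen, skl_over_gamma = gen_error_and_skl(loss, mu, prior, gamma=2.0, n=3)
print(gen, skl_over_gamma)
```

Under these assumptions the two printed values should agree up to floating-point error. Enumeration only works for very small finite spaces, since the number of datasets grows as the number of data symbols raised to the power n.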

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Algorithms · Machine Learning and ELM