Generalization Bounds: Perspectives from Information Theory and PAC-Bayes

Fredrik Hellström; Giuseppe Durisi; Benjamin Guedj; Maxim Raginsky

arXiv:2309.04381·cs.LG·March 28, 2024·6 cites

Open Access

TL;DR

This paper unifies information-theoretic and PAC-Bayesian perspectives on generalization in machine learning, highlighting their connections, differences, and applications, especially in deep learning.

Contribution

The monograph provides a comprehensive, unified framework linking PAC-Bayesian and information-theoretic generalization bounds, with insights into their common structure and their implications for deep neural networks.

Findings

01 Unified treatment of PAC-Bayesian and information-theoretic bounds
02 Analysis of the conditional mutual information framework
03 Application of methods to deep learning models

Abstract

A fundamental question in theoretical machine learning is generalization. Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms, and design new ones. Recently, it has garnered increased interest due to its potential applicability for a variety of learning algorithms, including deep neural networks. In parallel, an information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established. This framework is intimately connected to the PAC-Bayesian approach, and a number of results have been independently discovered in both strands. In this monograph, we highlight this strong connection and present a unified treatment of PAC-Bayesian and information-theoretic generalization bounds. We present…

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Statistical Mechanics and Entropy · Face and Expression Recognition