# Tightening Mutual Information Based Bounds on Generalization Error

**Authors:** Yuheng Bu, Shaofeng Zou, Venugopal V. Veeravalli

arXiv: 1901.04609 · 2020-08-06

## TL;DR

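This paper derives an information-theoretic upper bound on the generalization error of supervised learning algorithms, expressed through the mutual information between each individual training sample and the algorithm's output. The bound holds under more general conditions on the loss function than earlier results, is tighter than them, and can be estimated easily in practice.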

## Contribution

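The paper presents a bound based on the mutual information between each individual training sample and the output of the learning algorithm. It is tighter than existing mutual information bounds and holds under more general conditions on the loss function, including for noisy, iterative algorithms such as stochastic gradient Langevin dynamics (SGLD).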

## Key findings

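- The new bound is tighter than existing information-theoretic bounds, including for noisy, iterative algorithms such as SGLD.
- It holds under more general conditions on the loss function than existing studies.
- Unlike existing bounds, which are difficult to compute and evaluate empirically, it can be estimated easily in practice.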

## Abstract

An information-theoretic upper bound on the generalization error of supervised learning algorithms is derived. The bound is constructed in terms of the mutual information between each individual training sample and the output of the learning algorithm. The bound is derived under more general conditions on the loss function than in existing studies; nevertheless, it provides a tighter characterization of the generalization error. Examples of learning algorithms are provided to demonstrate the tightness of the bound, and to show that it has a broad range of applicability. Application to noisy and iterative algorithms, e.g., stochastic gradient Langevin dynamics (SGLD), is also studied, where the constructed bound provides a tighter characterization of the generalization error than existing results. Finally, it is demonstrated that, unlike existing bounds, which are difficult to compute and evaluate empirically, the proposed bound can be estimated easily in practice.
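To make the idea of a per-sample mutual information bound concrete, here is a minimal Python sketch. It assumes, beyond what this summary states, that the loss is R-sub-Gaussian, in which case the bound is usually quoted as $\frac{1}{n}\sum_{i=1}^{n}\sqrt{2R^2\, I(W; Z_i)}$; consult the full paper for the exact statement and conditions. The function names and the value of `R` are hypothetical. The Gaussian-mean closed form is a standard identity, used here only to supply plausible mutual information values.

```python
import math


def individual_sample_bound(mi_per_sample, R):
    """Evaluate (1/n) * sum_i sqrt(2 * R^2 * I(W; Z_i)).

    mi_per_sample: per-sample mutual information values I(W; Z_i), in nats.
    R: sub-Gaussian parameter of the loss (an assumed, illustrative input).
    """
    n = len(mi_per_sample)
    if n == 0:
        raise ValueError("need at least one mutual information value")
    return sum(math.sqrt(2.0 * R**2 * mi) for mi in mi_per_sample) / n


def gaussian_mean_mi(n):
    """I(W; Z_i) in nats when W is the mean of n i.i.d. Gaussian samples.

    W and Z_i are jointly Gaussian with squared correlation 1/n, so
    I(W; Z_i) = -0.5 * log(1 - 1/n) = 0.5 * log(n / (n - 1)).
    """
    if n < 2:
        raise ValueError("n must be at least 2 for a finite value")
    return 0.5 * math.log(n / (n - 1))


def demo_bound(n, R=1.0):
    """Bound value when every sample carries the Gaussian-mean MI.

    R is a placeholder: the squared loss is not sub-Gaussian, so this
    only illustrates how the bound shrinks as n grows.
    """
    return individual_sample_bound([gaussian_mean_mi(n)] * n, R)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, demo_bound(n))
```

With `R = 1.0` and `n = 10`, `demo_bound` evaluates to the square root of ln(10/9), roughly 0.32, and the value decreases as `n` grows. In this sample-mean setting the output is a deterministic function of continuous data, so the mutual information between the full training set and the output is infinite, while each per-sample term stays finite.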

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.04609/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1901.04609/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1901.04609/full.md

---
Source: https://tomesphere.com/paper/1901.04609