On Unified and Sharpened CMI Bounds for Generalization Errors

Yang Lu; Matthias Frey; Margreta Kuijper; Jingge Zhu

arXiv:2605.21056 · cs.IT · May 21, 2026

TL;DR

This paper introduces a new family of information-theoretic generalization bounds based on conditional mutual information, unifying and sharpening existing bounds through a leave-$m$-out cross-validation framework.

Contribution

The paper proposes a unified CMI-based bound, built on the leave-$m$-out cross-validation error, that recovers many known CMI-based bounds, yields new and sharper ones, and bridges the MI- and CMI-based approaches as $m$ tends to infinity.

Findings

01

In several simple examples, the new bounds are sharper than existing results or apply where existing results do not.

02

The framework bridges the gap between MI- and CMI-based bounds as $m$ tends to infinity.

03

Empirical results show the new bounds outperform existing ones.

Abstract

We present a new family of information-theoretic generalization bounds within the framework of conditional mutual information (CMI). Most of our results are based on the leave-$m$-out (L$m$O) cross-validation error, where $m$ denotes the number of hold-out supersamples. Under this setting, we propose a unified CMI-based bound that envelops and reproduces many known CMI-based bounds and also bridges the gap between the MI- and CMI-based bounds when $m$ tends to infinity. The proposed framework not only provides a unified description of the existing bounds but also yields new, sharper bounds. We show the benefits of the proposed bounds through several simple examples, where the existing results are either inapplicable or looser. Moreover, under the premise that the loss function is bounded, we tighten the CMI quantities involved in the proposed bounds by reducing…
