Information-Theoretic Bayes Risk Lower Bounds for Realizable Models

Matthew Nokleby; Ahmad Beirami

arXiv:2111.04579·cs.LG·November 9, 2021

TL;DR

This paper establishes information-theoretic lower bounds on the Bayes risk and generalization error for realizable models, linking model complexity, mutual information, and risk scaling.

Contribution

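It uses the rate-distortion function of the model parameters to lower-bound the mutual information between the training samples and the parameters that is needed to learn a realizable model to a given Bayes risk, and it extends the bounds to settings with label noise.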

Findings

01

Bayes risk under the zero-one loss scales as $\Omega(d_{vc} / n)$ for VC classes when the rate-distortion and mutual information bounds match.

02

Mutual information between the training samples and the model parameters is bounded above by $d_{vc} \log(n)$ for VC classes.

03

Lower bounds hold even with label noise.

Abstract

We derive information-theoretic lower bounds on the Bayes risk and generalization error of realizable machine learning models. In particular, we employ an analysis in which the rate-distortion function of the model parameters bounds the required mutual information between the training samples and the model parameters in order to learn a model up to a Bayes risk constraint. For realizable models, we show that both the rate-distortion function and mutual information admit expressions that are convenient for analysis. For models that are (roughly) lower Lipschitz in their parameters, we bound the rate-distortion function from below, whereas for VC classes, the mutual information is bounded above by $d_{vc} \log(n)$. When these conditions match, the Bayes risk with respect to the zero-one loss scales no faster than $\Omega(d_{vc} / n)$, which matches known outer bounds and…
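The argument in the abstract can be illustrated numerically. The sketch below is not the paper's derivation and does not reproduce its constants or conditions. It relies on two ingredients. First, a standard fact: by the Sauer-Shelah lemma, a class of VC dimension $d$ produces at most $\sum_{i \le d} \binom{n}{i} \le (en/d)^d$ labelings of $n$ points, and in a realizable model with inputs independent of the parameters (an assumption made here), the mutual information between parameters and training set is at most the log of that count. Second, a hypothetical rate-distortion lower bound of the form $R(D) \ge d \log(c / D)$, chosen purely for illustration; the function names and the constant `c` are likewise assumptions. Requiring $R(D)$ to be at most the mutual information then forces $D \ge c\,d / (e n)$, which is the $\Omega(d_{vc} / n)$ scaling.

```python
import math


def log_num_labelings(n: int, d: int) -> float:
    """Log of the Sauer-Shelah count sum_{i=0}^{d} C(n, i), in nats.

    This upper-bounds the log of the number of distinct labelings that a
    class of VC dimension d can induce on n points.
    """
    return math.log(sum(math.comb(n, i) for i in range(d + 1)))


def mi_upper_bound(n: int, d: int) -> float:
    """Upper bound on I(parameters; training set) in nats.

    Assumes a realizable model (labels are a deterministic function of
    parameters and inputs) with inputs independent of the parameters, so
    the mutual information is at most the entropy of the labels given
    the inputs, which is at most the log of the number of labelings.
    """
    return log_num_labelings(n, d)


def implied_risk_lower_bound(n: int, d: int, c: float = 1.0) -> float:
    """Smallest risk D compatible with R(D) <= mutual information.

    HYPOTHETICAL rate-distortion form, for illustration only:
        R(D) >= d * log(c / D).
    Solving d * log(c / D) <= I for D gives D >= c * exp(-I / d).
    """
    return c * math.exp(-mi_upper_bound(n, d) / d)


if __name__ == "__main__":
    d = 10
    for n in (100, 1_000, 10_000, 100_000):
        risk = implied_risk_lower_bound(n, d)
        # risk * n / d stays roughly constant, i.e. risk scales like d / n.
        print(f"n={n:>7}  MI<= {mi_upper_bound(n, d):7.2f} nats  "
              f"risk>= {risk:.3e}  risk*n/d = {risk * n / d:.3f}")
```

Running the script prints the implied bound for several sample sizes; the last column staying roughly constant is what the $d/n$ scaling looks like under these assumptions.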

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Machine Learning and Data Classification