Generalization Bounds via Convex Analysis
Gábor Lugosi, Gergely Neu

TL;DR
This paper extends generalization bounds in supervised learning by replacing mutual information with strongly convex dependence measures, enabling bounds for heavy-tailed and smooth loss functions using convex analysis.
Contribution
It introduces a framework that replaces mutual information with any strongly convex dependence measure when deriving generalization bounds, with the subgaussianity condition on the losses replaced by a bound on a norm matched to the chosen measure.
Findings
Bounds in terms of p-norm divergences and Wasserstein-2 distance.
Applicable to heavy-tailed loss distributions.
Applicable to highly smooth loss functions.
Abstract
Since the celebrated works of Russo and Zou (2016, 2019) and Xu and Raginsky (2017), it has been well known that the generalization error of supervised learning algorithms can be bounded in terms of the mutual information between their input and output, provided that the loss of any fixed hypothesis has a subgaussian tail. In this work, we generalize this result beyond the standard choice of Shannon's mutual information to measure the dependence between the input and the output. Our main result shows that it is indeed possible to replace the mutual information by any strongly convex function of the joint input-output distribution, with the subgaussianity condition on the losses replaced by a bound on an appropriately chosen norm capturing the geometry of the dependence measure. This allows us to derive a range of generalization bounds that are either entirely new or strengthen…
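For orientation, the classical mutual-information bound that the abstract takes as its starting point is commonly stated as below. This is a sketch of the standard Xu and Raginsky (2017) form, not a formula taken from this page; the symbols S, W, L, L_S, and σ are notation assumed here for illustration, and the loss of every fixed hypothesis is assumed to be σ-subgaussian.

```latex
% Notation assumed for illustration (not from this page):
%   S = (Z_1, ..., Z_n)  i.i.d. training sample (the algorithm's input)
%   W                    hypothesis returned by the algorithm (its output)
%   L(w)   = E[\ell(w, Z)]                   population risk
%   L_S(w) = (1/n) \sum_{i=1}^n \ell(w, Z_i) empirical risk
%   \ell(w, Z) is \sigma-subgaussian for every fixed w
\[
  \bigl| \mathbb{E}\left[ L(W) - L_S(W) \right] \bigr|
  \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(S; W)}
\]
```

In the paper's framing, the mutual information I(S; W) on the right-hand side is the term replaced by a general strongly convex dependence measure, and the subgaussianity constant σ is replaced by a bound in a matching norm.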
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Domain Adaptation and Few-Shot Learning · Statistical Methods and Inference
