Comparing Comparators in Generalization Bounds
Fredrik Hellström, Benjamin Guedj

TL;DR
This paper introduces a general framework for deriving information-theoretic and PAC-Bayesian generalization bounds using arbitrary convex comparator functions, unifying and extending existing bounds with new optimality results.
Contribution
It develops a generic approach to generalization bounds involving convex comparator functions, identifies the convex conjugate of the bounding distribution's CGF (the Cramér function) as the optimal comparator, and extends the bounds to new families of bounding distributions.
Findings
The tightest bound is obtained when the comparator is the convex conjugate of the CGF of the bounding distribution.
This confirms the near-optimality of known bounds for bounded and sub-Gaussian losses.
The approach yields novel bounds under other bounding distributions.
Abstract
We derive generic information-theoretic and PAC-Bayesian generalization bounds involving an arbitrary convex comparator function, which measures the discrepancy between the training and population loss. The bounds hold under the assumption that the cumulant-generating function (CGF) of the comparator is upper-bounded by the corresponding CGF within a family of bounding distributions. We show that the tightest possible bound is obtained with the comparator being the convex conjugate of the CGF of the bounding distribution, also known as the Cramér function. This conclusion applies more broadly to generalization bounds with a similar structure. This confirms the near-optimality of known bounds for bounded and sub-Gaussian losses and leads to novel bounds under other bounding distributions.
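To make the convex conjugate of a CGF concrete, here is a minimal Python sketch that evaluates it numerically for two standard bounding distributions. It is an illustration only and not code from the paper: the helper names (cramer, gaussian_cgf, bernoulli_cgf, binary_kl) and the search interval are assumptions chosen for this example. It relies on two textbook facts: the conjugate of the centered Gaussian CGF is x^2 / (2 sigma^2), and the conjugate of the Bernoulli(p) CGF evaluated at q is the binary KL divergence between q and p.

```python
import math


def cramer(cgf, x, lo=-50.0, hi=50.0, iters=200):
    """Numerically evaluate sup over l of (l * x - cgf(l)) on [lo, hi].

    A CGF is convex, so the objective is concave in l and ternary
    search converges to the maximizer. The interval [lo, hi] is an
    assumption and must contain the maximizer.
    """
    def objective(l):
        return l * x - cgf(l)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective((lo + hi) / 2)


def gaussian_cgf(sigma):
    """CGF of a centered Gaussian with standard deviation sigma."""
    return lambda l: 0.5 * (sigma * l) ** 2


def bernoulli_cgf(p):
    """CGF of a Bernoulli random variable with mean p (not centered)."""
    return lambda l: math.log(1 - p + p * math.exp(l))


def binary_kl(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p), for 0 < q, p < 1."""
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))


if __name__ == "__main__":
    # Gaussian case: compare with the closed form x^2 / (2 sigma^2).
    print(cramer(gaussian_cgf(2.0), 1.0), 1.0 ** 2 / (2 * 2.0 ** 2))
    # Bernoulli case: compare with the binary KL divergence.
    print(cramer(bernoulli_cgf(0.5), 0.3), binary_kl(0.3, 0.5))
```

In each printed pair, the numerical conjugate and the closed form should agree up to the precision of the search. Under this reading, a squared-difference comparator corresponds to a Gaussian bounding distribution and a binary-KL comparator to a Bernoulli one.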
Taxonomy
Topics: Statistical Mechanics and Entropy · Distributed Sensor Networks and Detection Algorithms · Sparse and Compressive Sensing Techniques
