Bounding generalization error with input compression: An empirical study with infinite-width networks
Angus Galloway, Anna Golubeva, Mahmoud Salem, Mihai Nica, Yani Ioannou, Graham W. Taylor

TL;DR
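This paper empirically tests a bound on the generalization error of deep neural networks that is based on input compression, measured as the mutual information between the input and the final layer representation in the infinite-width limit. The bound is often tight for the best-performing models, detects randomized training labels in many cases, and reflects test-time robustness.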
Contribution
First empirical study of an input compression-based generalization error bound using mutual information in infinite-width neural networks.
Findings
Bound is often tight for top models
Detects label randomization effectively
Correlates with test-time robustness
Abstract
Estimating the Generalization Error (GE) of Deep Neural Networks (DNNs) is an important task that often relies on the availability of held-out data. The ability to better predict GE based on a single training set may yield overarching DNN design principles that reduce reliance on trial-and-error, along with other performance assessment advantages. In search of a quantity relevant to GE, we investigate the Mutual Information (MI) between the input and final layer representations, using the infinite-width DNN limit to bound MI. An existing input compression-based GE bound is used to link MI and GE. To the best of our knowledge, this represents the first empirical study of this bound. In our attempt to empirically falsify the theoretical bound, we find that it is often tight for best-performing models. Furthermore, it detects randomization of training labels in many cases, reflects test-time…
Taxonomy
Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
