Bounding generalization error with input compression: An empirical study with infinite-width networks

Angus Galloway; Anna Golubeva; Mahmoud Salem; Mihai Nica; Yani Ioannou; Graham W. Taylor

arXiv:2207.09408 · cs.LG · July 20, 2022 · 1 citation

TL;DR

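This paper empirically tests an input compression-based bound on the generalization error of deep neural networks, using the infinite-width limit to bound the mutual information between the input and the final-layer representation. The bound is often tight for the best-performing models, detects randomization of training labels in many cases, and correlates with test-time robustness.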

Contribution

To the authors' knowledge, the first empirical study of an existing input compression-based generalization error bound, with mutual information bounded using the infinite-width neural network limit.

Findings

01. The bound is often tight for the best-performing models

02. It detects randomization of training labels in many cases

03. It correlates with test-time robustness

Abstract

Estimating the Generalization Error (GE) of Deep Neural Networks (DNNs) is an important task that often relies on availability of held-out data. The ability to better predict GE based on a single training set may yield overarching DNN design principles to reduce a reliance on trial-and-error, along with other performance assessment advantages. In search of a quantity relevant to GE, we investigate the Mutual Information (MI) between the input and final layer representations, using the infinite-width DNN limit to bound MI. An existing input compression-based GE bound is used to link MI and GE. To the best of our knowledge, this represents the first empirical study of this bound. In our attempt to empirically falsify the theoretical bound, we find that it is often tight for best-performing models. Furthermore, it detects randomization of training labels in many cases, reflects test-time…

Taxonomy

Topics: Machine Learning and Data Classification · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks