Fantastic Generalization Measures and Where to Find Them
Yiding Jiang, Behnam Neyshabur, Hossein Mobahi, Dilip Krishnan, Samy Bengio

TL;DR
This paper conducts the first large-scale empirical study of over 40 complexity measures across more than 10,000 deep networks to evaluate their effectiveness in predicting generalization performance.
Contribution
It systematically assesses the validity of various theoretical and empirical complexity measures on a broad set of models, revealing limitations and promising directions.
Findings
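Several complexity measures fail, in some cases surprisingly, to predict generalization once hyperparameters are varied in a carefully controlled way.
Other measures remain predictive under the same controls and are identified as promising for further research.
Evaluating each measure across thousands of systematically varied networks tests whether conclusions drawn from small sets of models still hold in other settings.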
Abstract
Generalization of deep networks has been of great interest in recent years, resulting in a number of theoretically and empirically motivated complexity measures. However, most papers proposing such measures study only a small set of models, leaving open the question of whether the conclusions drawn from those experiments would remain valid in other settings. We present the first large-scale study of generalization in deep networks. We investigate more than 40 complexity measures taken from both theoretical bounds and empirical studies. We train over 10,000 convolutional networks by systematically varying commonly used hyperparameters. Hoping to uncover potentially causal relationships between each measure and generalization, we analyze carefully controlled experiments and show surprising failures of some measures as well as promising measures for further research.
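To make "predicting generalization" concrete, one option is to check whether a complexity measure ranks trained models in the same order as their generalization gap. The sketch below does this with Kendall's rank correlation, which is an assumption here since the summary above does not state the scoring method. The function name `kendall_tau` and all numeric values are hypothetical and for illustration only.

```python
from itertools import combinations


def kendall_tau(measure_values, gen_gaps):
    """Kendall rank correlation (tau-a) between two equal-length sequences.

    Returns a value in [-1, 1]: +1 means the measure orders the models
    exactly as the generalization gap does, -1 means the reverse order.
    """
    if len(measure_values) != len(gen_gaps):
        raise ValueError("sequences must have the same length")
    pairs = list(combinations(range(len(measure_values)), 2))
    if not pairs:
        raise ValueError("need at least two models")
    score = 0
    for i, j in pairs:
        d_measure = measure_values[i] - measure_values[j]
        d_gap = gen_gaps[i] - gen_gaps[j]
        product = d_measure * d_gap
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
        score += (product > 0) - (product < 0)
    return score / len(pairs)


# Hypothetical numbers for four trained models (not taken from the paper).
gen_gaps = [0.05, 0.10, 0.15, 0.20]    # test error minus train error
measure_good = [1.0, 2.0, 3.0, 4.0]    # rises with the gap
measure_bad = [4.0, 3.0, 2.0, 1.0]     # falls as the gap rises
measure_mixed = [1.0, 3.0, 2.0, 4.0]   # one pair out of order

tau_good = kendall_tau(measure_good, gen_gaps)
tau_bad = kendall_tau(measure_bad, gen_gaps)
tau_mixed = kendall_tau(measure_mixed, gen_gaps)
```

A measure with correlation near +1 would be a useful predictor under this criterion, while a value near -1 means the measure ranks models in the opposite order from their generalization gap. In a controlled study, the same score could be computed within groups of models that differ in only one hyperparameter, which is one way to probe the potentially causal relationships the abstract mentions.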
Taxonomy
Topics: Machine Learning and Algorithms · Computability, Logic, AI Algorithms · Machine Learning and Data Classification
