Stronger generalization bounds for deep nets via a compression approach

Sanjeev Arora; Rong Ge; Behnam Neyshabur; Yi Zhang

arXiv:1802.05296 · cs.LG · November 28, 2018 · 95 cites

TL;DR

This paper introduces a compression-based framework that yields generalization bounds for deep neural networks that are orders of magnitude better in practice than naive parameter counting. The analysis rests on newly identified noise stability properties of trained nets, which the authors also verify experimentally.

Contribution

It presents an explicit and efficient compression of the trained net that yields tighter generalization bounds, extends the analysis to convolutional networks, and offers some theoretical justification for the empirical success of compressing deep nets.

Findings

01. Compression-based bounds outperform naive parameter counting

02. Deep nets exhibit noise stability properties

03. Bounds extend to convolutional neural networks

Abstract

Deep nets generalize well despite having more parameters than the number of training samples. Recent works try to explain this using PAC-Bayes and margin-based analyses, but do not as yet yield sample complexity bounds better than naive parameter counting. The current paper shows generalization bounds that are orders of magnitude better in practice. These rely on new succinct reparametrizations of the trained net: a compression that is explicit and efficient. These yield generalization bounds via a simple compression-based framework introduced here. Our results also provide some theoretical justification for the widespread empirical success in compressing deep nets. The analysis of the correctness of our compression relies on some newly identified "noise stability" properties of trained deep nets, which are also experimentally verified. The study…
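To make the "noise stability" idea in the abstract concrete, here is a minimal sketch of how one might measure how strongly a single linear layer amplifies injected Gaussian noise relative to its signal. The definition used here (expected squared output perturbation divided by squared output norm, with noise scaled by the input norm) is an assumption chosen for illustration, and this page does not confirm that it matches the paper's exact definition. The function names `noise_sensitivity` and `closed_form_sensitivity` are hypothetical, and the matrices are toy examples rather than trained weights.

```python
import math
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def sq_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(c * c for c in v)


def noise_sensitivity(M, x, trials=2000, seed=0):
    """Monte Carlo estimate of E[ ||M(x + eta*||x||) - Mx||^2 / ||Mx||^2 ],
    with eta ~ N(0, I). This definition is an assumption for illustration."""
    rng = random.Random(seed)
    base = matvec(M, x)
    base_sq = sq_norm(base)
    if base_sq == 0.0:
        raise ValueError("M maps x to zero; sensitivity is undefined")
    x_norm = math.sqrt(sq_norm(x))
    total = 0.0
    for _ in range(trials):
        # Inject isotropic Gaussian noise scaled to the size of the input.
        noisy = [xi + rng.gauss(0.0, 1.0) * x_norm for xi in x]
        out = matvec(M, noisy)
        total += sq_norm([a - b for a, b in zip(out, base)]) / base_sq
    return total / trials


def closed_form_sensitivity(M, x):
    """For a linear map the expectation above equals
    ||M||_F^2 * ||x||^2 / ||Mx||^2, since E||M eta||^2 = ||M||_F^2."""
    frob_sq = sum(m * m for row in M for m in row)
    return frob_sq * sq_norm(x) / sq_norm(matvec(M, x))


if __name__ == "__main__":
    x = [1.0, 2.0, 2.0]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Rank-one map aligned with x: passes the signal, suppresses most noise.
    aligned = [[xi * xj / sq_norm(x) for xj in x] for xi in x]
    for name, M in [("identity", identity), ("aligned rank-one", aligned)]:
        print(name, noise_sensitivity(M, x), closed_form_sensitivity(M, x))
```

Under this assumed definition, a map whose weight is concentrated along the direction of the actual input has lower sensitivity than one that treats all directions equally. That is the intuition behind a layer being stable to noise: the signal passes through while random perturbations are attenuated.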

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms