Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
Colin Wei, Tengyu Ma

TL;DR
This paper introduces the all-layer margin for deep networks and uses it to derive tighter generalization bounds, to analyze robust test error directly, and to motivate a training algorithm that improves both clean and robust test performance.
Contribution
The paper proposes the all-layer margin as a new analysis tool for deep neural networks, yielding tighter theoretical bounds and a practical training method.
Findings
Tighter generalization bounds depending on Jacobian and layer norms
First direct analysis of robust test error for deep networks
Training algorithm that improves clean and robust test performance
Abstract
For linear classifiers, the relationship between (normalized) output margin and generalization is captured in a clear and simple bound: a large output margin implies good generalization. Unfortunately, for deep models, this relationship is less clear: existing analyses of the output margin give complicated bounds which sometimes depend exponentially on depth. In this work, we propose to instead analyze a new notion of margin, which we call the "all-layer margin." Our analysis reveals that the all-layer margin has a clear and direct relationship with generalization for deep models. This enables the following concrete applications of the all-layer margin: 1) by analyzing the all-layer margin, we obtain tighter generalization bounds for neural nets which depend on Jacobian and hidden layer norms and remove the exponential dependency on depth; 2) our neural net results easily translate to…
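The linear-classifier case mentioned at the start of the abstract can be made concrete with a minimal sketch. It computes the normalized output margin of a linear classifier, assuming binary labels in {-1, +1}; the function names and the toy data are hypothetical and chosen only for illustration. It does not implement the paper's all-layer margin, which the abstract does not define.

```python
import math

def normalized_margin(w, b, x, y):
    """Normalized output margin of a linear classifier on one example.

    Assumes y is -1 or +1. The value is y * (w . x + b) / ||w||_2, i.e. the
    signed distance from x to the decision boundary (positive if correct).
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return y * score / norm

def min_normalized_margin(w, b, data):
    """Smallest normalized margin over a dataset of (x, y) pairs."""
    return min(normalized_margin(w, b, x, y) for x, y in data)

# Hypothetical toy data: three 2-D points with labels in {-1, +1}.
w, b = [3.0, 4.0], 0.0
data = [([1.0, 1.0], 1), ([-2.0, 0.0], -1), ([0.0, 0.5], 1)]

smallest = min_normalized_margin(w, b, data)

# Rescaling the weights leaves the normalized margin unchanged, which is
# why the normalized quantity (not the raw score) is the meaningful one.
w_scaled = [10.0 * wi for wi in w]
smallest_scaled = min_normalized_margin(w_scaled, 10.0 * b, data)
```

Dividing by the weight norm is the design choice that matters here: without it, the margin could be inflated arbitrarily by scaling the weights, so it would carry no information about generalization.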
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Machine Learning and Algorithms
Methods: Test
