Sensitivity and Generalization in Neural Networks: an Empirical Study

Roman Novak; Yasaman Bahri; Daniel A. Abolafia; Jeffrey Pennington; Jascha Sohl-Dickstein

arXiv:1802.08760 · stat.ML · June 20, 2018 · 225 cites

Open Access

TL;DR

This empirical study investigates the relationship between neural network complexity, robustness to input perturbations, and generalization, revealing that more robust models tend to generalize better across various architectures and datasets.

Contribution

The paper provides extensive empirical evidence linking input-output Jacobian norm to neural network generalization and robustness, highlighting factors that influence this relationship.

Findings

01. Robustness to input perturbations correlates with better generalization.

02. Factors such as data augmentation and ReLU non-linearities are associated with both greater robustness and better generalization.

03. The input-output Jacobian norm can be predictive of generalization at the level of individual test points.

Abstract

In practice it is often found that large over-parameterized neural networks generalize better than their smaller counterparts, an observation that appears to conflict with classical notions of function complexity, which typically favor smaller models. In this work, we investigate this tension between complexity and generalization through an extensive empirical exploration of two natural metrics of complexity related to sensitivity to input perturbations. Our experiments survey thousands of models with various fully-connected architectures, optimizers, and other hyper-parameters, as well as four different image classification datasets. We find that trained neural networks are more robust to input perturbations in the vicinity of the training data manifold, as measured by the norm of the input-output Jacobian of the network, and that it correlates well with generalization. We further…
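
As an illustration of the sensitivity measure named in the abstract, the following is a minimal sketch of the Frobenius norm of an input-output Jacobian at a single input point. Everything in it is an assumption made for illustration rather than the authors' code: the tiny randomly initialized ReLU network, the choice to differentiate the softmax outputs, the central finite-difference approximation, and all function names (`make_network`, `forward`, `jacobian_fd`, `frobenius_norm`). A real experiment would more likely use automatic differentiation on a trained model.

```python
import math
import random


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    # Subtract the maximum for numerical stability.
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def affine(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def make_network(sizes, seed=0):
    # Hypothetical helper: random Gaussian weights, zero biases (untrained).
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
             for _ in range(n_out)]
        layers.append((W, [0.0] * n_out))
    return layers


def forward(layers, x):
    # Fully-connected ReLU network with a softmax output layer.
    h = x
    for W, b in layers[:-1]:
        h = relu(affine(W, b, h))
    W, b = layers[-1]
    return softmax(affine(W, b, h))


def jacobian_fd(f, x, eps=1e-5):
    # Central finite differences: J[i][j] approximates d f_i / d x_j.
    n_out = len(f(x))
    J = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(n_out):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J


def frobenius_norm(J):
    return math.sqrt(sum(v * v for row in J for v in row))


layers = make_network([4, 8, 3])
x = [0.5, -1.0, 0.25, 2.0]
J = jacobian_fd(lambda z: forward(layers, z), x)
jac_norm = frobenius_norm(J)
print("Jacobian shape:", len(J), "x", len(J[0]))
print("Frobenius norm:", jac_norm)
```

A larger norm at a point means small input perturbations there move the outputs more. Averaging this quantity over test points gives one scalar sensitivity estimate per model, which could then be compared against a generalization gap.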

Taxonomy

Topics: Neural Networks and Applications · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
