Generalization Guarantees for Neural Networks via Harnessing the Low-rank Structure of the Jacobian
Samet Oymak, Zalan Fabian, Mingchen Li, Mahdi Soltanolkotabi

TL;DR
This paper develops a data-dependent theory of neural network generalization that leverages the low-rank structure of the Jacobian. It shows how training dynamics and dataset structure affect generalization, and what role the "information" and "nuisance" spaces of the Jacobian play.
Contribution
It introduces a novel framework that analyzes neural networks through the low-rank structure of the Jacobian spectrum, linking spectral properties to training speed, generalization, and dataset structure.
Findings
Jacobian matrices of neural networks exhibit low-rank structure with few large singular values.
Learning is fast in the information space, and labels that align with this space generalize better.
Label noise resides in the nuisance space, hindering optimization and generalization.
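The findings above can be illustrated with a small numerical sketch. It is not the authors' code: the one-hidden-layer tanh model, the two-cluster toy data, the relative cutoff of 0.5, and the helper names `jacobian`, `split_spectrum`, and `information_fraction` are all assumptions made for illustration. The sketch builds the Jacobian of the model outputs with respect to the hidden weights, splits its left singular vectors at a cutoff, and measures how much of a label vector's energy lies in the resulting information space.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 10, 20  # samples, input dimension, hidden units (illustrative sizes)

# Structured toy data (an assumption): two tight clusters at +mu and -mu.
mu = rng.standard_normal(d)
mu /= np.linalg.norm(mu)
y_clean = rng.choice([-1.0, 1.0], size=n)          # label = cluster membership
X = y_clean[:, None] * mu + 0.01 * rng.standard_normal((n, d))
y_noisy = rng.choice([-1.0, 1.0], size=n)          # labels unrelated to the inputs

# Hypothetical model f(x) = v^T tanh(W x); only W is treated as trainable.
W = rng.standard_normal((k, d))
v = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)


def jacobian(X, W, v):
    """Jacobian of the n model outputs with respect to vec(W), shape (n, k*d)."""
    pre = X @ W.T                       # (n, k) pre-activations
    dact = 1.0 - np.tanh(pre) ** 2      # (n, k) derivative of tanh
    # d f(x_i) / d W[j, :] = v[j] * tanh'(w_j . x_i) * x_i
    J = (dact * v)[:, :, None] * X[:, None, :]
    return J.reshape(X.shape[0], -1)


def split_spectrum(J, rel_cutoff):
    """Split left singular vectors at alpha = rel_cutoff * (largest singular value)."""
    U, s, _ = np.linalg.svd(J, full_matrices=False)
    keep = s >= rel_cutoff * s[0]
    return U[:, keep], U[:, ~keep], s


def information_fraction(U_info, y):
    """Fraction of the squared norm of y that lies in the span of U_info."""
    return float(np.sum((U_info.T @ y) ** 2) / np.sum(y ** 2))


J = jacobian(X, W, v)
U_info, U_nuis, s = split_spectrum(J, rel_cutoff=0.5)

frac_clean = information_fraction(U_info, y_clean)
frac_noisy = information_fraction(U_info, y_noisy)

print("information-space dimension:", U_info.shape[1])
print("fraction of clean labels in information space:", frac_clean)
print("fraction of random labels in information space:", frac_noisy)
```

In this toy setup the cluster labels should fall almost entirely in the information space, while random labels should fall mostly in the nuisance space. How sharp the separation is depends on the chosen cutoff and on how structured the data is.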
Abstract
Modern neural network architectures often generalize well despite containing many more parameters than the size of the training dataset. This paper explores the generalization capabilities of neural networks trained via gradient descent. We develop a data-dependent optimization and generalization theory which leverages the low-rank structure of the Jacobian matrix associated with the network. Our results help demystify why training and generalization are easier on clean and structured datasets and harder on noisy and unstructured datasets, as well as how the network size affects the evolution of the train and test errors during training. Specifically, we use a control knob to split the Jacobian spectrum into "information" and "nuisance" spaces associated with the large and small singular values. We show that over the information space learning is fast and one can quickly train a model with…
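The split described in the abstract can be written out explicitly. The notation below (cutoff $\alpha$, step size $\eta$, residual $r_t$) is assumed for illustration and may differ from the paper's own. The decay formula holds for gradient descent on a linearized least-squares model with a fixed Jacobian; it is a simplification meant to show why large singular values correspond to fast learning, not a statement of the paper's theorem.

```latex
% Singular value decomposition of the Jacobian J (n outputs, p parameters)
\[
  J = \sum_{i} \sigma_i \, u_i v_i^{\top}, \qquad \sigma_1 \ge \sigma_2 \ge \dots \ge 0 .
\]
% The cutoff \alpha (the "control knob") splits the left singular vectors
\[
  \mathcal{I} = \operatorname{span}\{ u_i : \sigma_i \ge \alpha \}, \qquad
  \mathcal{N} = \operatorname{span}\{ u_i : \sigma_i < \alpha \} .
\]
% Linearized gradient descent on (1/2)\|f - y\|^2 with residual r_t = f_t - y
\[
  r_{t} = \left( I - \eta \, J J^{\top} \right)^{t} r_0
  \quad\Longrightarrow\quad
  \langle u_i, r_t \rangle = \left( 1 - \eta \sigma_i^{2} \right)^{t} \langle u_i, r_0 \rangle .
\]
% With 0 < \eta \le 1/\sigma_1^2, every component in the information space satisfies
\[
  \left| \langle u_i, r_t \rangle \right| \le \left( 1 - \eta \alpha^{2} \right)^{t}
  \left| \langle u_i, r_0 \rangle \right| \quad \text{for } u_i \in \mathcal{I} ,
\]
% while components in the nuisance space contract more slowly, since \sigma_i < \alpha there.
```

Under this simplification, the part of the residual that lies in $\mathcal{I}$ shrinks geometrically at a rate set by $\alpha$, whereas the part in $\mathcal{N}$ can persist for many more iterations.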
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Sparse and Compressive Sensing Techniques
Methods: Affine Coupling · Normalizing Flows · Early Stopping
