Robust Generalization of Quadratic Neural Networks via Function Identification
Kan Xu, Hamsa Bastani, Osbert Bastani

TL;DR
This paper shows that quadratic neural networks, and by extension ReLU networks, can generalize robustly under distribution shift by identifying the underlying function rather than the exact parameters, which yields improved bounds for contextual bandits and transfer learning.
Contribution
The authors show that quadratic neural networks generalize robustly through function identification despite parameter symmetries, extend the result to ReLU networks, and apply it to contextual bandits and transfer learning.
Findings
Function identification enables robustness despite parameter symmetries.
New generalization bounds for quadratic neural networks.
Improved bounds for contextual bandits and transfer learning.
Abstract
A key challenge facing deep learning is that neural networks are often not robust to shifts in the underlying data distribution. We study this problem from the perspective of the statistical concept of parameter identification. Generalization bounds from learning theory often assume that the test distribution is close to the training distribution. In contrast, if we can identify the "true" parameters, then the model generalizes to arbitrary distribution shifts. However, neural networks typically have internal symmetries that make parameter identification impossible. We show that we can identify the function represented by a quadratic network even though we cannot identify its parameters; we extend this result to neural networks with ReLU activations. Thus, we can obtain robust generalization bounds for neural networks. We leverage this result to obtain new bounds for contextual bandits…
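The gap between parameter identification and function identification can be illustrated with a small numerical sketch. The architecture below is an assumption made for illustration, not necessarily the paper's exact model: a one-hidden-layer network with quadratic activation and unit output weights, f_W(x) = sum_j (w_j . x)^2 = x^T (W^T W) x. The names `quadratic_net`, `W`, `Q`, and the dimensions are hypothetical. Multiplying W by any orthogonal matrix Q changes the parameters but leaves W^T W, and therefore the function, unchanged.

```python
import numpy as np

def quadratic_net(W, x):
    """Assumed model: f_W(x) = sum_j (w_j . x)^2 = x^T (W^T W) x."""
    return float(np.sum((W @ x) ** 2))

rng = np.random.default_rng(0)
d, k = 4, 3                      # input dimension and hidden width (illustrative)
W = rng.normal(size=(k, d))      # hypothetical "true" parameters

# Build a random orthogonal matrix Q via a QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(k, k)))
W_rot = Q @ W                    # a different parameter matrix

x = rng.normal(size=d)           # an arbitrary test input

# The parameters differ, but the Gram matrix W^T W and the output agree.
same_function = bool(np.isclose(quadratic_net(W, x), quadratic_net(W_rot, x)))
same_params = bool(np.allclose(W, W_rot))
same_gram = bool(np.allclose(W.T @ W, W_rot.T @ W_rot))
print(same_function, same_params, same_gram)
```

In this toy setting the data can at best determine W^T W, not W itself, which is the sense in which the function is identifiable even though the parameters are not.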
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Advanced Bandit Algorithms Research
Methods: Test
