Robust Generalization of Quadratic Neural Networks via Function Identification

Kan Xu; Hamsa Bastani; Osbert Bastani

arXiv:2109.10935·cs.LG·February 18, 2022·1 cites

Open Access

TL;DR

This paper shows that quadratic neural networks, and by extension ReLU networks, can generalize robustly under distribution shift by identifying the function the network represents rather than its exact parameters, which yields improved bounds for contextual bandits and transfer learning.

Contribution

The authors show that neural networks can generalize robustly through function identification even though parameter symmetries rule out parameter identification; they extend the result to ReLU networks and apply it to contextual bandits and transfer learning.

Findings

1. Function identification enables robustness despite parameter symmetries.

2. New generalization bounds for quadratic neural networks.

3. Improved bounds for contextual bandits and transfer learning.

Abstract

A key challenge facing deep learning is that neural networks are often not robust to shifts in the underlying data distribution. We study this problem from the perspective of the statistical concept of parameter identification. Generalization bounds from learning theory often assume that the test distribution is close to the training distribution. In contrast, if we can identify the "true" parameters, then the model generalizes to arbitrary distribution shifts. However, neural networks typically have internal symmetries that make parameter identification impossible. We show that we can identify the function represented by a quadratic network even though we cannot identify its parameters; we extend this result to neural networks with ReLU activations. Thus, we can obtain robust generalization bounds for neural networks. We leverage this result to obtain new bounds for contextual bandits…
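
The symmetry argument in the abstract can be illustrated with a small sketch. It assumes a one-hidden-layer quadratic network of the form f(x) = sum_j (w_j . x)^2, which is a common textbook form and may differ from the exact model in the paper; the helper names `quadratic_net` and `rotate_hidden_units` are hypothetical. Rotating the hidden units by an orthogonal matrix changes every weight but leaves the function unchanged, so the parameters cannot be identified while the function can.

```python
import math

def quadratic_net(W, x):
    """Assumed model: f(x) = sum_j (w_j . x)^2, one row of W per hidden unit."""
    return sum(sum(w_i * x_i for w_i, x_i in zip(row, x)) ** 2 for row in W)

def rotate_hidden_units(W, theta):
    """Left-multiply a 2-row weight matrix by a 2x2 rotation (an orthogonal matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    r0, r1 = W
    return [
        [c * a - s * b for a, b in zip(r0, r1)],
        [s * a + c * b for a, b in zip(r0, r1)],
    ]

# Two hidden units, three inputs; the values are arbitrary illustration data.
W = [[1.0, 2.0, -0.5], [0.3, -1.0, 4.0]]
W_rot = rotate_hidden_units(W, 0.7)

x = [0.2, -1.5, 3.0]

# The weights differ entry by entry...
params_differ = any(
    abs(a - b) > 1e-6 for ra, rb in zip(W, W_rot) for a, b in zip(ra, rb)
)
# ...but the two networks compute the same function value.
outputs_match = math.isclose(
    quadratic_net(W, x), quadratic_net(W_rot, x), rel_tol=1e-9
)
print(params_differ, outputs_match)
```

Because the two weight matrices agree as functions on every input, not only on inputs near the training data, a guarantee stated in terms of the function carries over to shifted test distributions in a way that a guarantee tied to one particular weight matrix cannot.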

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Advanced Bandit Algorithms Research

Methods: Test