Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise

Spencer Frei; Yuan Cao; Quanquan Gu

arXiv:2101.01152 · cs.LG · February 16, 2021

TL;DR

This paper proves that one-hidden-layer leaky ReLU networks of any width, trained by SGD, generalize in the presence of adversarial label noise: their classification accuracy is competitive with that of the best halfspace for a broad class of distributions.

Contribution

To the authors' knowledge, it gives the first theoretical guarantee that overparameterized neural networks trained by SGD can generalize when the data is corrupted with adversarial label noise.

Findings

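1. SGD-trained one-hidden-layer leaky ReLU networks achieve classification accuracy competitive with that of the best halfspace.

2. The guarantee holds for a broad class of distributions that includes log-concave isotropic and hard margin distributions.

3. To the authors' knowledge, this is the first proof that overparameterized networks trained by SGD generalize under adversarial label noise.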

Abstract

We consider a one-hidden-layer leaky ReLU network of arbitrary width trained by stochastic gradient descent (SGD) following an arbitrary initialization. We prove that SGD produces neural networks that have classification accuracy competitive with that of the best halfspace over the distribution for a broad class of distributions that includes log-concave isotropic and hard margin distributions. Equivalently, such networks can generalize when the data distribution is linearly separable but corrupted with adversarial label noise, despite the capacity to overfit. To the best of our knowledge, this is the first work to show that overparameterized neural networks trained by SGD can generalize when the data is corrupted with adversarial label noise.
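The setting described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. Several details are assumptions made only for illustration and are not stated on this page: the second layer is fixed at random signs scaled by 1/sqrt(width), there are no bias terms, the loss is the logistic loss, the leaky ReLU slope is 0.1, inputs are standard Gaussian, and the label corruption flips the labels of the points farthest from the separating hyperplane. All function names are hypothetical.

```python
import numpy as np

ALPHA = 0.1  # leaky ReLU slope for negative inputs (illustrative value)

def leaky_relu(z):
    return np.where(z > 0, z, ALPHA * z)

def leaky_relu_grad(z):
    return np.where(z > 0, 1.0, ALPHA)

def sample(n, d, flip_frac, rng):
    """Gaussian inputs, labels sign(x_1), then the largest-margin labels are flipped."""
    X = rng.standard_normal((n, d))
    y = np.where(X[:, 0] > 0, 1.0, -1.0)
    k = int(flip_frac * n)
    flipped = np.argsort(-np.abs(X[:, 0]))[:k]  # a fixed, non-random corruption (assumption)
    y[flipped] *= -1.0
    return X, y

def train_sgd(X, y, width, lr, rng):
    """One pass of online SGD on the logistic loss; only first-layer weights are trained."""
    d = X.shape[1]
    W = 0.1 * rng.standard_normal((width, d))
    a = rng.choice([-1.0, 1.0], size=width) / np.sqrt(width)  # fixed second layer (assumption)
    for x_t, y_t in zip(X, y):
        z = W @ x_t
        f = a @ leaky_relu(z)
        # derivative of log(1 + exp(-y f)) with respect to f
        dloss_df = -y_t / (1.0 + np.exp(y_t * f))
        W -= lr * dloss_df * np.outer(a * leaky_relu_grad(z), x_t)
    return W, a

def error_rate(W, a, X, y):
    preds = np.sign(leaky_relu(X @ W.T) @ a)
    return float(np.mean(preds != y))

rng = np.random.default_rng(0)
X_train, y_train = sample(20000, 5, 0.05, rng)
X_test, y_test = sample(5000, 5, 0.05, rng)
W, a = train_sgd(X_train, y_train, width=32, lr=0.02, rng=rng)

net_err = error_rate(W, a, X_test, y_test)
# error of the halfspace that generated the clean labels, measured on noisy test labels
halfspace_err = float(np.mean(np.sign(X_test[:, 0]) != y_test))
print(f"network error: {net_err:.3f}, halfspace error: {halfspace_err:.3f}")
```

Changing `width` lets one check informally whether the gap between the two printed error rates stays small across network sizes, which is the qualitative behavior the abstract describes. A run like this is only an illustration and says nothing about the constants in the paper's guarantee.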

Code & Models

Repositories

spencerfrei/nn_generalization_agnostic_noise (tf, official)

Videos

Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Machine Learning and Data Classification

Methods: Stochastic Gradient Descent