Fundamental tradeoffs between memorization and robustness in random features and neural tangent regimes

Elvis Dohmatob

arXiv:2106.02630·stat.ML·June 7, 2021



TL;DR

This paper studies the trade-off between memorization and robustness in two-layer neural networks in high-dimensional linearized regimes. It proves tight lower bounds on the Sobolev-seminorm of models that memorize the training data and reports a multiple-descent phenomenon in the robustness of the interpolator.

Contribution

It establishes tight lower bounds on the Sobolev-seminorm for neural networks in various regimes, linking memorization to robustness and validating findings with empirical experiments.

Findings

01 Lower bounds on the Sobolev-seminorm depend on the network width and the data dimensions.

02 The bounds are tight: they are achieved by min-norm / least-squares interpolators.

03 The robustness of the interpolator exhibits a multiple-descent phenomenon.
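As a rough illustration of the quantities named in these findings, the sketch below fits a min-norm least-squares interpolator on random features and estimates its Sobolev-seminorm by Monte Carlo. It is a minimal sketch under stated assumptions, not the paper's experimental setup: the ReLU activation, the Gaussian weights and inputs, the random ±1 labels, the sizes `n`, `d`, `k`, and the helper names `fit_min_norm_rf` and `sobolev_seminorm` are all hypothetical choices made for this example.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def relu_grad(z):
    # Derivative of ReLU (taken to be 0 at the kink).
    return (z > 0).astype(z.dtype)


def fit_min_norm_rf(X, y, W):
    """Min-norm least-squares output weights for fixed random features.

    X: (n, d) inputs, y: (n,) targets, W: (k, d) frozen hidden weights.
    """
    Phi = relu(X @ W.T)              # (n, k) feature matrix
    return np.linalg.pinv(Phi) @ y   # pseudo-inverse gives the min-norm solution


def sobolev_seminorm(X_eval, W, v):
    """Monte Carlo estimate of sqrt(E ||grad_x f(x)||_2^2) over rows of X_eval.

    For f(x) = sum_j v_j relu(w_j . x), grad_x f(x) = W^T (v * relu'(W x)).
    """
    G = (relu_grad(X_eval @ W.T) * v) @ W   # (m, d), one gradient per row
    return np.sqrt(np.mean(np.sum(G ** 2, axis=1)))


# Assumed toy setting: overparameterized random features (k > n).
rng = np.random.default_rng(0)
n, d, k = 50, 20, 400
X = rng.standard_normal((n, d)) / np.sqrt(d)   # inputs with norm close to 1
y = rng.choice([-1.0, 1.0], size=n)            # random labels to memorize
W = rng.standard_normal((k, d))                # frozen random hidden weights

v = fit_min_norm_rf(X, y, W)
train_err = np.max(np.abs(relu(X @ W.T) @ v - y))

X_eval = rng.standard_normal((2000, d)) / np.sqrt(d)  # fresh samples
s = sobolev_seminorm(X_eval, W, v)
print(f"max train error: {train_err:.2e}, Sobolev-seminorm estimate: {s:.3f}")
```

Rerunning the sketch while varying `n`, `d`, or `k` gives a crude way to see how the estimate changes with the problem size; any such curve reflects this toy setting only.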

Abstract

This work studies the (non)robustness of two-layer neural networks in various high-dimensional linearized regimes. We establish fundamental trade-offs between memorization and robustness, as measured by the Sobolev-seminorm of the model w.r.t. the data distribution, i.e., the square root of the average squared $L_{2}$-norm of the gradients of the model w.r.t. its input. More precisely, if $n$ is the number of training examples, $d$ is the input dimension, and $k$ is the number of hidden neurons in a two-layer neural network, we prove for a large class of activation functions that, if the model memorizes even a fraction of the training set, then its Sobolev-seminorm is lower-bounded by (i) $n$ in the case of infinite-width random features (RF) or neural tangent kernel (NTK) with $d ≳ n$; (ii) $n$ in the case of finite-width RF with proportionate scaling of $d$ and $k$; and (iii)…
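The verbal definition of the robustness measure in the abstract can be written out as a formula. The notation below is assumed for illustration and is not taken from the paper: $P$ denotes the data distribution, $f$ the model, and the two-layer form without bias terms (output weights $v_j$, hidden weights $w_j$, activation $\sigma$) is a simplifying assumption.

```latex
% Sobolev-seminorm of f w.r.t. the data distribution P:
% square root of the average squared L2-norm of the input gradient.
\[
  \|f\|_{\dot{W}^{1,2}(P)}
  := \left( \mathbb{E}_{x \sim P} \, \|\nabla_x f(x)\|_2^2 \right)^{1/2}
\]

% For a two-layer network with k hidden neurons (assumed form, no biases):
\[
  f(x) = \sum_{j=1}^{k} v_j \, \sigma(w_j^\top x),
  \qquad
  \nabla_x f(x) = \sum_{j=1}^{k} v_j \, \sigma'(w_j^\top x) \, w_j .
\]
```

A large value means that, on average over the data, small input perturbations can change the output substantially, which is why a lower bound on this quantity is read as a statement of nonrobustness.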


Code & Models

Repositories

dohmatob/multiple-descent-robustness


Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Adversarial Robustness in Machine Learning

Methods: Neural Tangent Kernel