Universality of empirical risk minimization

Andrea Montanari; Basil Saeed

arXiv:2202.08832 · math.ST · November 1, 2022 · 23 cites

Open Access

TL;DR

This paper shows that in high-dimensional supervised learning, the training and test errors of empirical risk minimization are universal: they depend on the feature distribution only through its covariance structure, not on its finer details. The results cover random features and neural tangent models.

Contribution

The paper establishes universality results for empirical risk minimization in high dimensions without requiring a strongly convex learning procedure or feature vectors with independent entries.

Findings

01. Training error depends only on feature covariance structure.

02. Test error of near-minimizers shows similar universality.

03. Results apply to random features and neural tangent models.

Abstract

Consider supervised learning from i.i.d. samples $\{(x_{i}, y_{i})\}_{i \leq n}$ where $x_{i} \in \mathbb{R}^{p}$ are feature vectors and $y_{i} \in \mathbb{R}$ are labels. We study empirical risk minimization over a class of functions that are parameterized by $k = O(1)$ vectors $\theta_{1}, \dots, \theta_{k} \in \mathbb{R}^{p}$, and prove universality results both for the training and test error. Namely, under the proportional asymptotics $n, p \to \infty$, with $n / p = \Theta(1)$, we prove that the training error depends on the random features distribution only through its covariance structure. Further, we prove that the minimum test error over near-empirical risk minimizers enjoys similar universality properties. In particular, the asymptotics of these quantities can be computed, to leading order, under a simpler model in which…
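The universality claim for the training error can be probed numerically. The sketch below is an illustration under assumptions that are not taken from the paper: squared loss with a ridge penalty, tanh random features $x = \tanh(Wz)$ with Gaussian $z$, labels that are linear in the features plus noise, and the specific sizes chosen. The helper names `ridge_training_error` and `compare_training_errors` are hypothetical. It compares the minimum regularized empirical risk on random features with the same quantity on Gaussian features whose covariance is matched to an empirical estimate.

```python
import numpy as np


def ridge_training_error(X, y, lam):
    """Minimum over theta of mean((y - X theta)^2) + lam * ||theta||^2."""
    n, p = X.shape
    # Closed-form ridge minimizer of the objective above.
    theta = np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)
    residual = y - X @ theta
    return float(np.mean(residual ** 2) + lam * theta @ theta)


def compare_training_errors(n=800, p=400, d=200, lam=0.1, noise=0.5, seed=0):
    """Return (random-features error, matched-Gaussian error). Illustrative setup."""
    rng = np.random.default_rng(seed)

    # Fixed first-layer weights with unit-norm rows (an assumed choice).
    W = rng.standard_normal((p, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)

    def random_features(m):
        # x = tanh(W z) with z ~ N(0, I_d); tanh is odd, so features have mean zero.
        return np.tanh(rng.standard_normal((m, d)) @ W.T)

    # Estimate the feature covariance from a large independent sample.
    big = random_features(40 * p)
    sigma = big.T @ big / big.shape[0]
    chol = np.linalg.cholesky(sigma + 1e-8 * np.eye(p))

    # Assumed label model: linear in the features plus Gaussian noise.
    theta0 = rng.standard_normal(p) / np.sqrt(p)

    def labels(X):
        return X @ theta0 + noise * rng.standard_normal(X.shape[0])

    X_rf = random_features(n)                        # non-Gaussian features
    X_gauss = rng.standard_normal((n, p)) @ chol.T   # Gaussian, matched covariance
    err_rf = ridge_training_error(X_rf, labels(X_rf), lam)
    err_gauss = ridge_training_error(X_gauss, labels(X_gauss), lam)
    return err_rf, err_gauss


if __name__ == "__main__":
    err_rf, err_gauss = compare_training_errors()
    print(f"random features: {err_rf:.4f}  matched Gaussian: {err_gauss:.4f}")
```

At these finite sizes the two values should be close but not identical; the statement in the abstract is asymptotic, so the gap is expected to shrink as $n$ and $p$ grow with $n/p$ held fixed.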

Taxonomy

Topics: Machine Learning in Materials Science · Neural Networks and Applications · Statistical Methods and Inference