Universality of empirical risk minimization
Andrea Montanari, Basil Saeed

TL;DR
This paper shows that, in high-dimensional supervised learning, the training and test errors of empirical risk minimization are universal: they depend on the feature distribution only through its covariance structure. The result covers random features and neural tangent models.
Contribution
It establishes universality results for empirical risk minimization in high dimensions without requiring strong convexity or independence assumptions on features.
Findings
Training error depends on the feature distribution only through its covariance structure.
The minimum test error over near-empirical risk minimizers shows similar universality.
Results apply to random features and neural tangent models.
Abstract
Consider supervised learning from i.i.d. samples $\{(x_i, y_i)\}_{i \le n}$, where $x_i \in \mathbb{R}^p$ are feature vectors and $y_i \in \mathbb{R}$ are labels. We study empirical risk minimization over a class of functions that are parameterized by $\mathsf{k} = O(1)$ vectors $\theta_1, \dots, \theta_{\mathsf{k}} \in \mathbb{R}^p$, and prove universality results both for the training and test error. Namely, under the proportional asymptotics $n, p \to \infty$, with $n/p = \Theta(1)$, we prove that the training error depends on the random features distribution only through its covariance structure. Further, we prove that the minimum test error over near-empirical risk minimizers enjoys similar universality properties. In particular, the asymptotics of these quantities can be computed to leading order under a simpler model in which…
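The universality claim can be probed numerically. The sketch below is not the paper's experiment; it is a minimal illustration under several assumptions: the features are random features of the form tanh(W z) with Gaussian z, the learning procedure is ridge regression with linear labels plus noise, and the "simpler model" is taken to be Gaussian features with the same covariance (an assumption here, since the abstract text above is cut off before describing it). The function names, problem sizes, and the Monte Carlo covariance estimate are all hypothetical choices made for illustration.

```python
import numpy as np


def ridge_training_error(X, y, lam):
    """Minimum over theta of (1/n) * ||y - X theta||^2 + lam * ||theta||^2."""
    n, p = X.shape
    theta = np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)
    resid = y - X @ theta
    return resid @ resid / n + lam * theta @ theta


def simulate(n=400, p=200, d=100, lam=0.1, noise=0.5, trials=20, seed=0):
    """Average ridge training error for random features vs. Gaussian features
    with (approximately) the same covariance. All sizes are illustrative."""
    rng = np.random.default_rng(seed)

    # Fixed first-layer weights with unit-norm rows (an illustrative choice).
    W = rng.standard_normal((p, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    theta_star = rng.standard_normal(p) / np.sqrt(p)

    def random_features(m):
        # x = tanh(W z) with z ~ N(0, I_d); tanh is odd, so E[x] = 0.
        return np.tanh(rng.standard_normal((m, d)) @ W.T)

    # Monte Carlo estimate of the feature covariance Sigma = E[x x^T].
    X_mc = random_features(20000)
    Sigma = X_mc.T @ X_mc / X_mc.shape[0]
    # Small jitter keeps the Cholesky factorization numerically safe.
    L = np.linalg.cholesky(Sigma + 1e-8 * np.eye(p))

    errs_rf, errs_gauss = [], []
    for _ in range(trials):
        X = random_features(n)                   # non-Gaussian features
        G = rng.standard_normal((n, p)) @ L.T    # Gaussian rows with covariance Sigma
        for feats, out in ((X, errs_rf), (G, errs_gauss)):
            y = feats @ theta_star + noise * rng.standard_normal(n)
            out.append(ridge_training_error(feats, y, lam))
    return float(np.mean(errs_rf)), float(np.mean(errs_gauss))


err_rf, err_gauss = simulate()
print(f"random features: {err_rf:.4f}   Gaussian equivalent: {err_gauss:.4f}")
```

If universality holds in this setting, the two printed averages should be close, with the gap shrinking as n and p grow at a fixed ratio. Ridge regression is strongly convex, so this example sits inside the regime covered by earlier results; it illustrates only the statement, not the paper's extension beyond strong convexity.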
Taxonomy
Topics: Machine Learning in Materials Science · Neural Networks and Applications · Statistical Methods and Inference
