When does Gaussian equivalence fail and how to fix it: Non-universal behavior of random features with quadratic scaling
Garrett G. Wen, Hong Hu, Yue M. Lu, Zhou Fan, Theodor Misiakiewicz

TL;DR
This paper investigates where Gaussian equivalence theory (GET) breaks down in high-dimensional random feature (RF) models under quadratic scaling, and introduces a Conditional Gaussian Equivalent (CGE) model that accurately predicts RF behavior when GET fails.
Contribution
The paper identifies settings in which GET fails and proposes the CGE model, which captures the true asymptotics of RF models in the quadratic scaling regime.
Findings
The CGE model accurately predicts RF behavior when GET fails.
GET yields incorrect predictions for target functions that depend on a low-dimensional projection.
Sharp asymptotics are derived for the training and test errors.
Abstract
A major effort in modern high-dimensional statistics has been devoted to the analysis of linear predictors trained on nonlinear feature embeddings via empirical risk minimization (ERM). Gaussian equivalence theory (GET) has emerged as a powerful universality principle in this context: it states that the behavior of high-dimensional, complex features can be captured by Gaussian surrogates, which are more amenable to analysis. Despite its remarkable successes, numerical experiments show that this equivalence can fail even for simple embeddings -- such as polynomial maps -- under general scaling regimes. We investigate this breakdown in the setting of random feature (RF) models in the quadratic scaling regime, where both the number of features and the sample size grow quadratically with the data dimension. We show that when the target function depends on a low-dimensional projection of…
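The setup named in the abstract (a linear predictor fit by ERM on random nonlinear features, with the number of features and the sample size both growing like the square of the dimension) can be simulated in a few lines. The sketch below is illustrative only and is not the paper's experiment: the ReLU activation, the ridge penalty `lam`, the scaling constants `alpha_n` and `alpha_p`, and the single-coordinate target `x_1^2 - 1` are all assumptions chosen to give a low-dimensional target under quadratic scaling.

```python
import numpy as np

def rf_ridge_test_error(d=20, alpha_n=1.0, alpha_p=0.5, lam=1e-2,
                        n_test=2000, seed=0):
    """Ridge regression on random ReLU features with n, p proportional to d^2.

    All modelling choices here (ReLU, ridge, the target) are illustrative
    assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = int(alpha_n * d**2)  # sample size, quadratic in d
    p = int(alpha_p * d**2)  # number of random features, quadratic in d

    # Random first-layer weights; rows have norm close to 1.
    W = rng.standard_normal((p, d)) / np.sqrt(d)

    def features(X):
        # Nonlinear feature embedding x -> relu(W x) / sqrt(p).
        return np.maximum(X @ W.T, 0.0) / np.sqrt(p)

    def target(X):
        # Depends on a one-dimensional projection of x (its first coordinate).
        return X[:, 0] ** 2 - 1.0

    X_train = rng.standard_normal((n, d))
    y_train = target(X_train)
    Phi = features(X_train)

    # ERM with squared loss and ridge penalty: solve (Phi^T Phi + lam I) a = Phi^T y.
    a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y_train)

    X_test = rng.standard_normal((n_test, d))
    test_err = np.mean((features(X_test) @ a - target(X_test)) ** 2)
    return n, p, float(test_err)

if __name__ == "__main__":
    n, p, err = rf_ridge_test_error()
    print(f"n={n}, p={p}, test error={err:.4f}")
```

Comparing this empirical test error against the prediction of a Gaussian surrogate for the features is the kind of check in which, according to the abstract, the equivalence can fail; the surrogate itself is not constructed here.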
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Tensor decomposition and applications · Gaussian Processes and Bayesian Inference
