When does Gaussian equivalence fail and how to fix it: Non-universal behavior of random features with quadratic scaling

Garrett G. Wen; Hong Hu; Yue M. Lu; Zhou Fan; Theodor Misiakiewicz

arXiv:2512.03325 · math.ST · December 4, 2025

Open Access

TL;DR

This paper investigates where Gaussian equivalence theory (GET) breaks down for high-dimensional random feature (RF) models, particularly under quadratic scaling, and introduces a Conditional Gaussian Equivalent (CGE) model that accurately predicts their behavior when GET fails.

Contribution

The paper identifies scenarios where GET fails and proposes a new CGE model that captures the true asymptotics of RF models in the quadratic scaling regime.

Findings

01. The CGE model accurately predicts RF behavior when GET fails.

02. GET yields incorrect predictions for low-dimensional target functions.

03. The paper derives sharp asymptotics for the training and test errors.

Abstract

A major effort in modern high-dimensional statistics has been devoted to the analysis of linear predictors trained on nonlinear feature embeddings via empirical risk minimization (ERM). Gaussian equivalence theory (GET) has emerged as a powerful universality principle in this context: it states that the behavior of high-dimensional, complex features can be captured by Gaussian surrogates, which are more amenable to analysis. Despite its remarkable successes, numerical experiments show that this equivalence can fail even for simple embeddings -- such as polynomial maps -- under general scaling regimes. We investigate this breakdown in the setting of random feature (RF) models in the quadratic scaling regime, where both the number of features and the sample size grow quadratically with the data dimension. We show that when the target function depends on a low-dimensional projection of…
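
The setting described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' experimental setup: it assumes standard Gaussian inputs, a ReLU activation, a ridge penalty for the ERM step, and a target that depends on a single input coordinate. The function name `rf_ridge_errors` and the parameters `alpha_p`, `alpha_n`, and `lam` are hypothetical. It fits an RF model with both the number of features and the sample size proportional to the square of the dimension, and reports empirical training and test errors. It does not compute the GET or CGE predictions.

```python
import numpy as np


def rf_ridge_errors(d, alpha_p=1.0, alpha_n=1.0, lam=1e-3, n_test=2000, seed=0):
    """Empirical train/test error of random-feature ridge regression.

    Quadratic scaling: p = alpha_p * d^2 features, n = alpha_n * d^2 samples.
    All modelling choices below are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    p = int(alpha_p * d**2)  # number of random features
    n = int(alpha_n * d**2)  # number of training samples

    # Random first-layer weights, each row normalized to unit length.
    W = rng.standard_normal((p, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)

    def features(X):
        # Assumed activation: ReLU, scaled by 1/sqrt(p).
        return np.maximum(X @ W.T, 0.0) / np.sqrt(p)

    def target(X):
        # Assumed low-dimensional target: a function of one coordinate only
        # (the degree-2 Hermite polynomial of the first coordinate).
        z = X[:, 0]
        return z**2 - 1.0

    X_train = rng.standard_normal((n, d))
    X_test = rng.standard_normal((n_test, d))
    y_train, y_test = target(X_train), target(X_test)

    # Ridge ERM: minimize ||y - Phi a||^2 + lam * ||a||^2 over a.
    Phi = features(X_train)
    a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y_train)

    train_err = float(np.mean((Phi @ a - y_train) ** 2))
    test_err = float(np.mean((features(X_test) @ a - y_test) ** 2))
    return train_err, test_err


if __name__ == "__main__":
    for d in (8, 12, 16):
        tr, te = rf_ridge_errors(d)
        print(f"d={d:2d}  train={tr:.4f}  test={te:.4f}")
```

Comparing such empirical errors against a theoretical prediction would require the GET or CGE formulas from the paper, which are not reproduced here.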

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Tensor decomposition and applications · Gaussian Processes and Bayesian Inference