Power-Law Spectrum of the Random Feature Model

Elliot Paquette; Ke Liang Xiao; Yizhe Zhu

arXiv:2603.14578·stat.ML·March 17, 2026

TL;DR

This paper investigates how the spectral power-law decay of data covariance matrices is preserved or altered after passing through a random feature layer with nonlinear activation, revealing that the power-law structure is largely maintained with minor logarithmic modifications.

Contribution

It provides a rigorous characterization of the eigenvalue spectrum of the random feature covariance for data with power-law spectra, showing preservation of the spectral decay exponent under random nonlinear projections.

Findings

01. Eigenvalues follow a power-law decay with the same exponent as the input covariance.

02. Logarithmic corrections depend on the degree of the monomial activation.

03. The spectral structure is preserved up to polylogarithmic factors.
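
One rough way to check a claim like "same decay exponent" on a computed spectrum is to fit a line to log-eigenvalue against log-index. The sketch below is an illustration only, not the paper's method: the helper name `fit_power_law_exponent` and its parameters are hypothetical, and a plain least-squares fit does not separate the exponent from polylogarithmic corrections, which bias the slope at finite index.

```python
import numpy as np

def fit_power_law_exponent(eigs, j_min=1, j_max=None):
    """Estimate alpha in eigs[j-1] ~ C * j**(-alpha) by a log-log least-squares fit.

    Hypothetical helper for illustration. `eigs` must be sorted in decreasing
    order and strictly positive on the fitted index range [j_min, j_max].
    """
    eigs = np.asarray(eigs, dtype=float)
    if j_max is None:
        j_max = len(eigs)
    j = np.arange(j_min, j_max + 1)          # 1-based eigenvalue indices
    lam = eigs[j_min - 1 : j_max]            # matching eigenvalues
    # polyfit with degree 1 returns [slope, intercept]
    slope, _intercept = np.polyfit(np.log(j), np.log(lam), 1)
    return -slope

# Sanity check on an exact power law lambda_j = j^(-1.5).
exact = np.arange(1, 501, dtype=float) ** (-1.5)
alpha_hat = fit_power_law_exponent(exact)
```

On a finite random-feature spectrum, restricting the fit to an intermediate index window (via `j_min` and `j_max`) is usually more sensible than fitting all indices, since the smallest eigenvalues are affected by the finite dimension.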

Abstract

Scaling laws for neural networks, in which the loss decays as a power-law in the number of parameters, data, and compute, depend fundamentally on the spectral structure of the data covariance, with power-law eigenvalue decay appearing ubiquitously in vision and language tasks. A central question is whether this spectral structure is preserved or destroyed when data passes through the basic building block of a neural network: a random linear projection followed by a nonlinear activation. We study this question for the random feature model: given data $x \sim N(0, H)$ in $\mathbb{R}^{v}$, where $H$ has $\alpha$-power-law spectrum ($\lambda_{j}(H) \asymp j^{-\alpha}$, $\alpha > 1$), a Gaussian sketch matrix $W \in \mathbb{R}^{v \times d}$, and an entrywise monomial $f(y) = y^{p}$, we characterize the eigenvalues of the population random-feature covariance $\mathbb{E}_{x }[\frac{1}{d}f(W^\top x…
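
The setup in the abstract can be simulated numerically. The following is a minimal sketch under several assumptions that are not in the source: the abstract is cut off before the covariance is fully defined, so the code assumes the usual uncentered outer-product form $\frac{1}{d}\,\mathbb{E}_x[f(W^\top x) f(W^\top x)^\top]$; it takes $H$ diagonal with $\lambda_j = j^{-\alpha}$ exactly, draws $W$ with i.i.d. standard normal entries, and replaces the population expectation with a Monte Carlo average. The function name `random_feature_spectrum` and all default sizes are hypothetical.

```python
import numpy as np

def random_feature_spectrum(v=200, d=50, alpha=1.5, p=2, n_samples=5000, seed=0):
    """Monte Carlo estimate of the random-feature covariance spectrum.

    Assumptions (not from the source): H = diag(j^-alpha), W has i.i.d. N(0, 1)
    entries, and the covariance is the uncentered second moment
    (1/d) * E[f(W^T x) f(W^T x)^T] with f(y) = y**p.
    """
    rng = np.random.default_rng(seed)
    lam = np.arange(1, v + 1, dtype=float) ** (-alpha)    # eigenvalues of H
    W = rng.standard_normal((v, d))                        # Gaussian sketch
    # Rows of X are samples x ~ N(0, H) with H diagonal.
    X = rng.standard_normal((n_samples, v)) * np.sqrt(lam)
    F = (X @ W) ** p                                       # entrywise monomial features
    K = F.T @ F / (n_samples * d)                          # empirical d x d covariance
    eigs = np.linalg.eigvalsh(K)[::-1]                     # sorted in decreasing order
    return eigs, lam, W

# For p = 1 the population covariance is available in closed form, W^T H W / d,
# which gives a check on the Monte Carlo estimate.
eigs_p1, lam, W = random_feature_spectrum(p=1)
exact_p1 = np.linalg.eigvalsh(W.T @ (lam[:, None] * W) / W.shape[1])[::-1]

# Degree-2 monomial activation.
eigs_p2, _, _ = random_feature_spectrum(p=2)
```

The smallest estimated eigenvalues are dominated by sampling noise, so any comparison with the input spectrum `lam` is only meaningful for the leading part of `eigs_p2`, and larger `n_samples` is needed as `p` grows because the features become heavier-tailed.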

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Quantum many-body systems · Neural Networks and Applications