# Eigenvalue distribution of the Neural Tangent Kernel in the quadratic scaling

**Authors:** Lucas Benigni, Elliot Paquette

arXiv: 2508.20036 · 2025-08-28

## TL;DR

This paper derives the asymptotic eigenvalue distribution of the neural tangent kernel (NTK) of a two-layer neural network in the scaling $\frac{n}{dp}\to\gamma_1$, $\frac{p}{d}\to\gamma_2$, and identifies it as the free multiplicative convolution of the Marchenko--Pastur distribution with a deterministic distribution depending on $\sigma$ and $D$.

## Contribution

The paper characterizes the limiting eigenvalue distribution of the NTK of a two-layer network in the quadratic scaling regime, where the sample size $n$ grows proportionally to $dp$ and the width $p$ grows proportionally to the input dimension $d$.

## Key findings

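- The limiting eigenvalue distribution is the free multiplicative convolution of the Marchenko--Pastur distribution with a deterministic distribution.
- That deterministic distribution depends on the activation function $\sigma$ and the diagonal matrix $D$.
- The result holds in the scaling limit $\frac{n}{dp}\to\gamma_1$ and $\frac{p}{d}\to\gamma_2$.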
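
For orientation, the Marchenko--Pastur ingredient named in these findings can be evaluated numerically. The block below is a minimal sketch of the standard unit-variance Marchenko--Pastur density with an aspect-ratio parameter `c` in (0, 1]. The function name `mp_density` and the value `c = 0.5` are illustrative assumptions; this summary does not state which parameter appears in the paper's limit, and the sketch does not compute the free multiplicative convolution itself.

```python
import numpy as np

def mp_density(x, c):
    """Standard Marchenko-Pastur density with ratio c in (0, 1] and unit variance."""
    a = (1.0 - np.sqrt(c)) ** 2  # left edge of the support
    b = (1.0 + np.sqrt(c)) ** 2  # right edge of the support
    x = np.asarray(x, dtype=float)
    inside = (x > a) & (x < b)
    rho = np.zeros_like(x)
    xi = x[inside]
    rho[inside] = np.sqrt((b - xi) * (xi - a)) / (2.0 * np.pi * c * xi)
    return rho

c = 0.5  # illustrative ratio, not taken from the paper
a = (1.0 - np.sqrt(c)) ** 2
b = (1.0 + np.sqrt(c)) ** 2

# Midpoint rule on the support, used as a sanity check of the formula.
N = 200_000
h = (b - a) / N
grid = a + h * (np.arange(N) + 0.5)
rho = mp_density(grid, c)
mass = float(np.sum(rho) * h)         # total mass of the density
mean = float(np.sum(grid * rho) * h)  # first moment of the density
```

With unit variance, both the total mass and the first moment of this density should come out close to 1, which is a quick check that the formula was typed correctly.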

## Abstract

We compute the asymptotic eigenvalue distribution of the neural tangent kernel of a two-layer neural network under a specific scaling of dimension. Namely, if $X\in\mathbb{R}^{n\times d}$ is an i.i.d random matrix, $W\in\mathbb{R}^{d\times p}$ is an i.i.d $\mathcal{N}(0,1)$ matrix and $D\in\mathbb{R}^{p\times p}$ is a diagonal matrix with i.i.d bounded entries, we consider the matrix

\[
\mathrm{NTK} = \frac{1}{d}XX^\top \odot \frac{1}{p}\sigma'\left(\frac{1}{\sqrt{d}}XW\right)D^2\sigma'\left(\frac{1}{\sqrt{d}}XW\right)^\top
\]

where $\sigma'$ is a pseudo-Lipschitz function applied entrywise and under the scaling $\frac{n}{dp}\to \gamma_1$ and $\frac{p}{d}\to \gamma_2$. We describe the asymptotic distribution as the free multiplicative convolution of the Marchenko--Pastur distribution with a deterministic distribution depending on $\sigma$ and $D$.
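
The matrix in the abstract can be sampled at small sizes to look at its empirical spectrum. The following is a rough sketch, not the authors' code. Several choices are assumptions made only for illustration: the entries of $X$ are taken standard Gaussian (the abstract only says i.i.d.), $\sigma = \tanh$ so that $\sigma'(t) = 1 - \tanh^2(t)$, the diagonal of $D$ is uniform on $[0, 1]$, and the sizes `d = p = 20`, `n = 200` give $n/(dp) = 0.5$ and $p/d = 1$. The helper name `sample_ntk` is hypothetical.

```python
import numpy as np

def sample_ntk(n, d, p, rng):
    """Sample (1/d) X X^T  ⊙  (1/p) σ'(XW/√d) D² σ'(XW/√d)^T for illustrative inputs."""
    X = rng.standard_normal((n, d))         # assumption: Gaussian data entries
    W = rng.standard_normal((d, p))         # i.i.d. N(0, 1) weights, as in the abstract
    D_diag = rng.uniform(0.0, 1.0, size=p)  # assumption: bounded i.i.d. diagonal of D
    pre = X @ W / np.sqrt(d)                # pre-activations, shape (n, p)
    S = 1.0 - np.tanh(pre) ** 2             # assumption: σ = tanh, so σ' = 1 - tanh²
    gram = X @ X.T / d                      # (1/d) X X^T, shape (n, n)
    feat = (S * D_diag**2) @ S.T / p        # (1/p) σ' D² σ'^T, shape (n, n)
    return gram * feat                      # entrywise (Hadamard) product

rng = np.random.default_rng(0)
d, p = 20, 20
n = 200  # n / (d * p) = 0.5, p / d = 1.0
K = sample_ntk(n, d, p, rng)
eigs = np.linalg.eigvalsh(K)  # real eigenvalues of the symmetric matrix K
```

A histogram of `eigs` gives the empirical eigenvalue distribution at these sizes. Both factors are positive semidefinite, so by the Schur product theorem their Hadamard product is too, and the eigenvalues should be nonnegative up to rounding error.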

---
Source: https://tomesphere.com/paper/2508.20036