# A Random Matrix Approach to Neural Networks

**Authors:** Cosme Louart, Zhenyu Liao, Romain Couillet

arXiv: 1702.05419 · 2017-06-30

## TL;DR

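This paper uses random matrix theory to analyze the Gram matrix of the random features $\sigma(WX)$ in single-layer random neural networks. It derives a deterministic equivalent for the spectrum of that matrix as the dimensions grow at the same rate, and uses it to predict network performance and guide hyperparameter tuning.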

## Contribution

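It analyzes the Gram matrix model $G=\frac1T\Sigma^{\rm T}\Sigma$, $\Sigma=\sigma(WX)$, classically found in random feature maps and random neural networks, and derives a deterministic equivalent for its resolvent and empirical spectral measure. This supports performance estimation and hyperparameter tuning for single-layer random neural networks.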

## Key findings

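- A deterministic equivalent for the resolvent $Q=(G+\gamma I_T)^{-1}$, $\gamma>0$, and from it for the empirical spectral measure of $G$, valid as $n,p,T$ grow large at the same rate
- Estimates of the asymptotic performance of single-layer random neural networks
- A fast practical means of tuning the network hyperparameters, derived from this analysis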

## Abstract

This article studies the Gram random matrix model $G=\frac1T\Sigma^{\rm T}\Sigma$, $\Sigma=\sigma(WX)$, classically found in the analysis of random feature maps and random neural networks, where $X=[x_1,\ldots,x_T]\in{\mathbb R}^{p\times T}$ is a (data) matrix of bounded norm, $W\in{\mathbb R}^{n\times p}$ is a matrix of independent zero-mean unit variance entries, and $\sigma:{\mathbb R}\to{\mathbb R}$ is a Lipschitz continuous (activation) function --- $\sigma(WX)$ being understood entry-wise. By means of a key concentration of measure lemma arising from non-asymptotic random matrix arguments, we prove that, as $n,p,T$ grow large at the same rate, the resolvent $Q=(G+\gamma I_T)^{-1}$, for $\gamma>0$, has a similar behavior as that met in sample covariance matrix models, involving notably the moment $\Phi=\frac{T}n{\mathbb E}[G]$, which provides in passing a deterministic equivalent for the empirical spectral measure of $G$. Application-wise, this result enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insights into the underlying mechanisms into play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters.
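The objects defined in the abstract can be built numerically in a few lines. The following is a minimal sketch, not the authors' code: the Gaussian choice for $W$, the `tanh` activation, the scaled Gaussian data matrix, the dimensions, and the helper name `gram_and_resolvent` are all illustrative assumptions. It forms $G$ and $Q$ and checks that the normalized trace of $Q$ equals the average of $1/(\lambda_i+\gamma)$ over the eigenvalues $\lambda_i$ of $G$, which is the link between the resolvent and the empirical spectral measure.

```python
import numpy as np


def gram_and_resolvent(X, n, gamma, sigma=np.tanh, seed=0):
    """Build G = Sigma^T Sigma / T and Q = (G + gamma I_T)^{-1}.

    Assumptions (not from the paper summary): W is Gaussian, which is one
    choice of independent zero-mean unit-variance entries, and sigma
    defaults to tanh, which is Lipschitz continuous.
    """
    rng = np.random.default_rng(seed)
    p, T = X.shape
    W = rng.standard_normal((n, p))           # n x p random weights
    Sigma = sigma(W @ X)                      # n x T, activation applied entry-wise
    G = Sigma.T @ Sigma / T                   # T x T Gram matrix
    Q = np.linalg.inv(G + gamma * np.eye(T))  # resolvent, well defined for gamma > 0
    return G, Q


# Illustrative dimensions of comparable size; values are arbitrary.
p, T, n, gamma = 40, 60, 80, 0.5
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((p, T)) / np.sqrt(p)  # synthetic data, scaled to keep its norm moderate

G, Q = gram_and_resolvent(X, n, gamma)

# Two routes to the same spectral summary of G at the point -gamma.
eigvals = np.linalg.eigvalsh(G)
trace_of_resolvent = np.trace(Q) / T
mean_inverse_shifted_eigs = np.mean(1.0 / (eigvals + gamma))
print(trace_of_resolvent, mean_inverse_shifted_eigs)
```

The two printed numbers agree up to floating-point error for any fixed draw of $W$. The paper's result concerns the deterministic limit of such quantities, which this sketch does not compute.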

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.05419/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1702.05419/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1702.05419/full.md

---
Source: https://tomesphere.com/paper/1702.05419