Appearance of Random Matrix Theory in Deep Learning

Nicholas P Baskerville; Diego Granziol; Jonathan P Keating

arXiv:2102.06740·cs.LG·December 28, 2021

TL;DR

This paper shows that the local spectral statistics of neural network loss surface Hessians agree with Gaussian Orthogonal Ensemble predictions from Random Matrix Theory, and uses that observation to motivate a new model of the loss surface.

Contribution

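The paper proposes a model for the true loss surfaces of neural networks, grounded in Random Matrix Theory, that allows for Hessian spectral densities with rank degeneracy and outliers and predicts a growing independence of loss gradients as a function of distance in weight-space.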

Findings

01. Spectral statistics match Gaussian Orthogonal Ensemble predictions

02. Loss surface models account for rank degeneracy and outliers

03. Hardness of optimization affects achieving state-of-the-art results

Abstract

We investigate the local spectral statistics of the loss surface Hessians of artificial neural networks, where we discover excellent agreement with Gaussian Orthogonal Ensemble statistics across several network architectures and datasets. These results shed new light on the applicability of Random Matrix Theory to modelling neural networks and suggest a previously unrecognised role for it in the study of loss surfaces in deep learning. Inspired by these observations, we propose a novel model for the true loss surfaces of neural networks, consistent with our observations, which allows for Hessian spectral densities with rank degeneracy and outliers, extensively observed in practice, and predicts a growing independence of loss gradients as a function of distance in weight-space. We further investigate the importance of the true loss surface in neural networks and find, in contrast to…
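As a rough illustration of what "local spectral statistics" means here, the sketch below computes one standard local statistic, the consecutive spacing ratio, for matrices sampled from the Gaussian Orthogonal Ensemble and for uncorrelated (Poisson) levels. It is not the authors' code and does not touch a neural network Hessian; the function names `sample_goe` and `spacing_ratios`, the matrix size, and the number of samples are all assumptions chosen for illustration. The ratio statistic is used because it needs no unfolding of the spectrum.

```python
import numpy as np


def sample_goe(n, rng):
    """Draw an n x n real symmetric matrix from the Gaussian Orthogonal Ensemble."""
    a = rng.standard_normal((n, n))
    # Symmetrise; off-diagonal variance is 1/n, diagonal variance is 2/n.
    return (a + a.T) / np.sqrt(2.0 * n)


def spacing_ratios(eigenvalues):
    """Consecutive spacing ratios min(s_i, s_{i+1}) / max(s_i, s_{i+1})."""
    spacings = np.diff(np.sort(eigenvalues))
    lo = np.minimum(spacings[:-1], spacings[1:])
    hi = np.maximum(spacings[:-1], spacings[1:])
    return lo / hi


rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 50 matrices of dimension 200.
goe_ratios = np.concatenate([
    spacing_ratios(np.linalg.eigvalsh(sample_goe(200, rng)))
    for _ in range(50)
])

# Baseline with no level repulsion: independent uniform "eigenvalues".
poisson_ratios = np.concatenate([
    spacing_ratios(rng.uniform(size=200))
    for _ in range(50)
])

goe_mean = goe_ratios.mean()
poisson_mean = poisson_ratios.mean()
print(f"GOE mean ratio:     {goe_mean:.3f}")
print(f"Poisson mean ratio: {poisson_mean:.3f}")
```

For reference, the mean ratio is known to be about 0.53 for the GOE and 2 ln 2 − 1 ≈ 0.386 for Poisson levels, so the two printed means should be clearly separated. Applying the same `spacing_ratios` function to the eigenvalues of a trained network's Hessian would be one way to check for GOE-like behaviour, under the assumption that the full spectrum can be computed.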

Code & Models

Repositories

npbaskerville/dnn-rmt-spacings (PyTorch, official)
