SETOL: A Semi-Empirical Theory of (Deep) Learning
Charles H Martin, Christopher Hinrichs

TL;DR
SETOL offers a semi-empirical theoretical framework explaining the success of modern neural networks by linking spectral properties of weight matrices to learning quality, using tools from statistical mechanics and random matrix theory.
Contribution
It introduces a novel semi-empirical theory of deep learning that connects spectral layer metrics with neural network performance, validated on simple and state-of-the-art models.
Findings
Spectral density metrics predict test accuracy trends.
SETOL's ERG metric aligns with heavy-tailed layer quality metrics.
The theory accurately describes layer qualities in SOTA neural networks.
Abstract
We present a Semi-Empirical Theory of Learning (SETOL) that explains the remarkable performance of State-Of-The-Art (SOTA) Neural Networks (NNs). We provide a formal explanation of the origin of the fundamental quantities in the phenomenological theory of Heavy-Tailed Self-Regularization (HTSR): the heavy-tailed power-law layer quality metrics, alpha and alpha-hat. In prior work, these metrics have been shown to predict trends in the test accuracies of pretrained SOTA NN models and, importantly, do so without needing access to either testing or training data. Our SETOL uses techniques from statistical mechanics as well as advanced methods from random matrix theory and quantum chemistry. The derivation suggests new mathematical preconditions for ideal learning, including a new metric, ERG, which is equivalent to applying a single step of the Wilson Exact Renormalization Group. We test the…
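To make the alpha metric concrete, the following is a minimal sketch of how a power-law exponent might be estimated from the eigenvalue spectrum of a single layer's weight matrix. It is an illustration under stated assumptions, not the authors' fitting procedure: the function names `layer_esd` and `hill_alpha` and the tail-size parameter `k` are hypothetical, and a plain Hill (maximum-likelihood) estimator with a fixed tail size stands in for whatever tail-selection method the paper actually uses.

```python
import numpy as np


def layer_esd(W):
    """Eigenvalues of X = W^T W / N for an N x M weight matrix W, sorted ascending.

    Assumption: this normalization is chosen for illustration only.
    """
    W = np.asarray(W, dtype=float)
    n_rows = W.shape[0]
    # The squared singular values of W are the nonzero eigenvalues of W^T W.
    singular_values = np.linalg.svd(W, compute_uv=False)
    return np.sort(singular_values ** 2 / n_rows)


def hill_alpha(eigs, k):
    """Hill estimate of alpha for a tail density rho(x) ~ x^(-alpha).

    Uses the k largest values as the tail and the (k+1)-th largest as x_min.
    The fixed k is a simplification (hypothetical parameter); no goodness-of-fit
    check is performed.
    """
    eigs = np.sort(np.asarray(eigs, dtype=float))
    if not 1 <= k < eigs.size:
        raise ValueError("k must satisfy 1 <= k < number of eigenvalues")
    x_min = eigs[-k - 1]
    if x_min <= 0:
        raise ValueError("x_min must be positive; choose a smaller k")
    tail = eigs[-k:]
    # Maximum-likelihood exponent for a continuous power law above x_min.
    return 1.0 + k / np.sum(np.log(tail / x_min))
```

A call such as `hill_alpha(layer_esd(W), k=50)` would return one exponent per layer. The estimate depends on `k`, so a practical analysis would select the start of the tail by a fit-quality criterion rather than fixing it by hand.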
