MLPs at the EOC: Concentration of the NTK
Dávid Terjék, Diego González-Sánchez

TL;DR
This paper analyzes the concentration of the Neural Tangent Kernel (NTK) in multilayer perceptrons (MLPs) initialized at the Edge of Chaos (EOC). It establishes finite-width conditions under which the NTK approximates its infinite-width limit, without relying on the gradient independence assumption.
Contribution
It proves that the NTK of MLPs with a specific family of activation functions concentrates around its infinite-width limit at finite width, without requiring linear overparameterization, and identifies hidden layer widths that grow quadratically with depth as the condition for an accurate approximation.
Findings
NTK concentrates around its infinite-width limit at finite width.
Activation functions with certain parameters improve NTK concentration.
Hidden layer widths that grow quadratically with depth ensure an accurate NTK approximation.
Abstract
We study the concentration of the Neural Tangent Kernel (NTK) of $l$-layer Multilayer Perceptrons (MLPs) equipped with activation functions $\phi(s) = a s + b |s|$ for some $a, b \in \mathbb{R}$ with the parameter $\theta$ being initialized at the Edge Of Chaos (EOC). Without relying on the gradient independence assumption that has only been shown to hold asymptotically in the infinitely wide limit, we prove that an approximate version of gradient independence holds at finite width. Showing that the NTK entries $K_\theta(x_{i_1}, x_{i_2})$ for $i_1, i_2 \in [1:n]$ over a dataset $\{x_1, \dots, x_n\}$ concentrate simultaneously via maximal inequalities, we prove that the NTK matrix $K(\theta) =…
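For readers unfamiliar with the object being studied, the following is a minimal sketch of how an empirical NTK matrix over a dataset can be computed for a finite-width MLP. It is not the authors' code. The assumptions here are: a ReLU activation (chosen purely for illustration), no biases, a scalar output, i.i.d. standard normal weights with a forward-pass scaling of sqrt(2 / fan_in) (the usual EOC choice for ReLU without biases), and no 1/n normalization of the matrix. The names `init_params` and `empirical_ntk` are hypothetical.

```python
import numpy as np

def init_params(widths, rng):
    # widths = [m_0, m_1, ..., m_L]; each weight matrix has i.i.d. N(0, 1) entries.
    # The variance scaling is applied in the forward pass (NTK parameterization).
    return [rng.standard_normal((m_out, m_in))
            for m_in, m_out in zip(widths[:-1], widths[1:])]

def empirical_ntk(params, X):
    # Returns the n x n matrix with entries <grad_theta f(x_i), grad_theta f(x_j)>
    # for a scalar-output ReLU MLP f (assumed architecture, see lead-in).
    n = X.shape[0]
    scales = [np.sqrt(2.0 / W.shape[1]) for W in params]  # assumed EOC scaling for ReLU

    # Forward pass: hs[l] is the input to layer l, zs[l] its pre-activation.
    hs, zs = [X], []
    for l, (W, s) in enumerate(zip(params, scales)):
        z = s * hs[-1] @ W.T
        zs.append(z)
        if l < len(params) - 1:
            hs.append(np.maximum(z, 0.0))

    # Backward pass: delta holds d f / d z_l for every sample (one row per sample).
    # The gradient w.r.t. W_l is s_l * outer(delta_l, h_l), so the layer's
    # contribution to the kernel is s_l^2 * <delta_i, delta_j> * <h_i, h_j>.
    delta = np.ones((n, 1))
    K = np.zeros((n, n))
    for l in reversed(range(len(params))):
        K += scales[l] ** 2 * (delta @ delta.T) * (hs[l] @ hs[l].T)
        if l > 0:
            delta = (delta @ (scales[l] * params[l])) * (zs[l - 1] > 0)
    return K

# Usage: one random draw of the parameters on a small random dataset.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
K = empirical_ntk(init_params([3, 256, 256, 1], rng), X)
```

Repeating the last line with fresh parameter draws and larger hidden widths gives an informal picture of concentration: the entries of `K` should fluctuate less from draw to draw as the widths grow.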
Taxonomy
Topics: Medical Imaging and Pathology Studies
Methods: Neural Tangent Kernel
