A Theory of Neural Tangent Kernel Alignment and Its Influence on Training
Haozhe Shan, Blake Bordelon

TL;DR
This paper gives a theoretical analysis of kernel alignment, the tendency of the neural tangent kernel (NTK) to align with the target function during training, and explains how this structural change to the kernel accelerates learning and improves generalization.
Contribution
It introduces a theoretical framework for understanding kernel alignment, including models of NTK evolution and the concept of kernel specialization in multi-output networks.
Findings
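In a toy model where the NTK evolves to accelerate training, kernel alignment emerges naturally from that demand.
Alignment depends on factors in both the network architecture and the structure of the data.
In multi-output networks, kernel specialization occurs: the kernel for each output aligns with that output's own target.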
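The link between alignment and training speed can be illustrated with a minimal sketch. It is not the paper's toy model (there the kernel itself evolves); it assumes a fixed kernel and plain gradient descent in function space, f <- f - lr * K (f - y), and compares two kernels of equal Frobenius norm. The names train_loss_curve, aligned_K, and isotropic_K, as well as the target vector and step size, are hypothetical choices made for illustration.

```python
import math


def matvec(K, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(k_ij * v_j for k_ij, v_j in zip(row, v)) for row in K]


def train_loss_curve(K, y, lr=0.5, steps=20):
    """Function-space gradient descent f <- f - lr * K (f - y), starting at f = 0.

    Returns the squared-error loss 0.5 * ||f - y||^2 recorded before each step.
    """
    f = [0.0] * len(y)
    losses = []
    for _ in range(steps):
        residual = [f_i - y_i for f_i, y_i in zip(f, y)]
        losses.append(0.5 * sum(r * r for r in residual))
        update = matvec(K, residual)
        f = [f_i - lr * u_i for f_i, u_i in zip(f, update)]
    return losses


# Hypothetical target values on four training points.
y = [1.0, -1.0, 1.0, -1.0]
n = len(y)
y_norm_sq = sum(v * v for v in y)

# Two kernels normalized to Frobenius norm 1:
# a rank-one kernel pointing along y (fully aligned with the target) ...
aligned_K = [[y_i * y_j / y_norm_sq for y_j in y] for y_i in y]
# ... and an isotropic kernel that treats every direction equally.
isotropic_K = [
    [1.0 / math.sqrt(n) if i == j else 0.0 for j in range(n)] for i in range(n)
]

aligned_losses = train_loss_curve(aligned_K, y)
isotropic_losses = train_loss_curve(isotropic_K, y)

print("step  aligned     isotropic")
for t in (0, 1, 5, 10):
    print(f"{t:>4}  {aligned_losses[t]:.6f}  {isotropic_losses[t]:.6f}")
```

Under these assumptions the target is an eigenvector of both kernels, with eigenvalue 1 for the aligned kernel and 1/2 for the isotropic one, so the aligned kernel shrinks the residual faster at every step even though both kernels have the same overall scale.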
Abstract
The training dynamics and generalization properties of neural networks (NN) can be precisely characterized in function space via the neural tangent kernel (NTK). Structural changes to the NTK during training reflect feature learning and underlie the superior performance of networks outside of the static kernel regime. In this work, we seek to theoretically understand kernel alignment, a prominent and ubiquitous structural change that aligns the NTK with the target function. We first study a toy model of kernel evolution in which the NTK evolves to accelerate training and show that alignment naturally emerges from this demand. We then study the alignment mechanism in deep linear networks and two-layer ReLU networks. These theories provide good qualitative descriptions of kernel alignment and specialization in practical networks and identify factors in network architecture and data structure…
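To make "aligning the NTK with the target function" concrete, here is a small sketch of one widely used kernel-target alignment score, A(K, y) = y^T K y / (||K||_F ||y||^2), evaluated on a Gram matrix K and a target vector y. That the paper uses exactly this normalization is an assumption; the function name kernel_alignment and the example matrices are hypothetical.

```python
import math


def kernel_alignment(K, y):
    """Kernel-target alignment A(K, y) = y^T K y / (||K||_F * ||y||^2).

    K is a square Gram matrix given as a list of rows, y a list of targets.
    For a positive semidefinite K the score lies between 0 and 1.
    """
    n = len(y)
    quad_form = sum(y[i] * K[i][j] * y[j] for i in range(n) for j in range(n))
    frobenius = math.sqrt(sum(k * k for row in K for k in row))
    return quad_form / (frobenius * sum(v * v for v in y))


# Hypothetical target values on four training points.
y = [1.0, -1.0, 1.0, -1.0]
n = len(y)

# Rank-one kernel y y^T: all of its weight lies along the target direction.
rank_one_K = [[y_i * y_j for y_j in y] for y_i in y]
# Identity kernel: no preferred direction.
identity_K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

rank_one_score = kernel_alignment(rank_one_K, y)
identity_score = kernel_alignment(identity_K, y)

print(f"rank-one kernel: {rank_one_score:.3f}")  # 1.000
print(f"identity kernel: {identity_score:.3f}")  # 0.500, i.e. 1 / sqrt(n)
```

The score is invariant to rescaling K, so it measures only how much of the kernel's weight sits along the target direction, not the kernel's overall magnitude.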
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Model Reduction and Neural Networks
Methods: Neural Tangent Kernel
