On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains
Yicheng Li, Zixiong Yu, Guhan Chen, Qian Lin

TL;DR
This paper develops a method to determine the eigenvalue decay rates of neural network-related kernel functions on general domains, linking neural network training dynamics to kernel eigenstructure and their generalization properties.
Contribution
It introduces a strategy for eigenvalue decay rate analysis of kernels on general domains, including neural tangent kernels, and connects this to neural network training and generalization.
Findings
Training dynamics of wide neural networks approximate NTK regression on general domains.
Eigenvalue decay rates influence the minimax optimality of neural networks.
Overfitted neural networks do not generalize well.
Abstract
In this paper, we provide a strategy to determine the eigenvalue decay rate (EDR) of a large class of kernel functions defined on a general domain rather than the sphere $\mathbb{S}^{d}$. This class of kernel functions includes, but is not limited to, the neural tangent kernel (NTK) associated with neural networks of different depths and various activation functions. After proving that the training dynamics of wide neural networks uniformly approximate those of neural tangent kernel regression on general domains, we further illustrate the minimax optimality of wide neural networks, provided that the underlying truth function $f \in [\mathcal{H}_{\mathrm{NTK}}]^{s}$, an interpolation space associated with the RKHS $\mathcal{H}_{\mathrm{NTK}}$ of the NTK. We also show that overfitted neural networks cannot generalize well. We believe our approach for determining the EDR of kernels might be…
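To make the notion of an eigenvalue decay rate concrete, the following is a minimal numerical sketch, not the paper's method. It assumes a kernel whose eigenvalues decay polynomially, $\lambda_i \asymp i^{-\beta}$, and estimates $\beta$ from the spectrum of a scaled Gram matrix. As a stand-in for the NTK it uses the Brownian-motion kernel $k(x, y) = \min(x, y)$ on $[0, 1]$, whose eigenvalues are known to be $1 / ((i - 1/2)^2 \pi^2)$, so the estimate should come out close to 2. The function name `empirical_edr`, the grid size, and the fitting window are illustrative choices.

```python
import numpy as np


def empirical_edr(kernel, x, i_min=5, i_max=50):
    """Estimate beta in lambda_i ~ i^(-beta) from a Gram matrix on points x.

    `kernel` must accept broadcastable arrays and return the kernel values.
    The fitting window [i_min, i_max] is a hypothetical choice: it skips the
    first few eigenvalues and stays well below n, where the empirical
    spectrum is a reasonable proxy for the operator spectrum.
    """
    n = len(x)
    gram = kernel(x[:, None], x[None, :])          # (n, n) Gram matrix
    # Eigenvalues of gram / n approximate those of the integral operator;
    # eigvalsh returns them in ascending order, so reverse to descending.
    eigvals = np.linalg.eigvalsh(gram / n)[::-1]
    idx = np.arange(i_min, i_max + 1)              # 1-based eigenvalue indices
    # Least-squares fit of log(lambda_i) against log(i); the slope is -beta.
    slope, _ = np.polyfit(np.log(idx), np.log(eigvals[idx - 1]), 1)
    return -slope


# Assumed example: min kernel on an equispaced grid in (0, 1].
n = 400
x = np.arange(1, n + 1) / n
beta = empirical_edr(np.minimum, x)
print(f"estimated decay exponent: {beta:.2f}")
```

Swapping `np.minimum` for another positive-definite kernel gives a rough empirical check of its decay exponent, though a finite-sample fit of this kind is only a heuristic and depends on the chosen index window.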
