A Revision of Neural Tangent Kernel-based Approaches for Neural Networks
Kyung-Su Kim, Aurélie C. Lozano, Eunho Yang

TL;DR
This paper revises neural tangent kernel (NTK)-based analyses of neural networks. It addresses issues caused by network scaling, validates key theoretical results, and proposes tighter bounds that give a clearer account of network training and generalization.
Contribution
It identifies and corrects errors in NTK-based theoretical results related to network scaling, providing tighter bounds and validation of previous findings.
Findings
NTK-based bounds are invalid when the scaling factor decreases with sample size.
Tighter bounds remove the influence of the scaling factor, restoring validity.
The derived kernel outperforms existing methods in few-shot learning.
Abstract
Recent theoretical works based on the neural tangent kernel (NTK) have shed light on the optimization and generalization of over-parameterized networks and have partially bridged the gap between their practical success and classical learning theory. In particular, the NTK-based approach has yielded three representative results: (1) A training error bound was derived to show that networks can fit any finite training sample perfectly, reflecting a tighter characterization of training speed that depends on the data complexity. (2) A generalization error bound independent of network size was derived using a data-dependent complexity measure (CMD). It follows from this CMD bound that networks can generalize arbitrary smooth functions. (3) A simple, analytic kernel function was derived that is indeed equivalent to a fully-trained network. This kernel outperforms its corresponding…
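To make the notion of an analytic kernel concrete, the sketch below compares a closed-form kernel with its finite-width estimate. It assumes the two-layer ReLU kernel commonly used in the earlier NTK literature, H(x_i, x_j) = x_i·x_j (π − arccos(x_i·x_j)) / (2π) for unit-norm inputs; this is an illustrative stand-in and not necessarily the kernel derived in this paper. The function names, the width, and the sample inputs are all hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(x):
    """Scale x to unit Euclidean norm (the closed form assumes unit-norm inputs)."""
    norm = math.sqrt(dot(x, x))
    return [a / norm for a in x]


def analytic_ntk(xi, xj):
    """Closed-form two-layer ReLU kernel for unit-norm inputs (assumed form)."""
    c = max(-1.0, min(1.0, dot(xi, xj)))  # clamp against rounding error
    return c * (math.pi - math.acos(c)) / (2.0 * math.pi)


def empirical_ntk(xi, xj, width, rng):
    """Monte Carlo estimate using `width` hidden units with N(0, I) weights.

    A hidden unit contributes only when it is active on both inputs.
    """
    d = len(xi)
    active = 0
    for _ in range(width):
        w = [rng.gauss(0.0, 1.0) for _ in range(d)]
        if dot(w, xi) >= 0.0 and dot(w, xj) >= 0.0:
            active += 1
    return dot(xi, xj) * active / width


rng = random.Random(0)  # fixed seed so the run is reproducible
x1 = normalize([1.0, 0.0, 0.0])
x2 = normalize([1.0, 1.0, 0.0])

exact = analytic_ntk(x1, x2)
approx = empirical_ntk(x1, x2, width=20000, rng=rng)
print(f"analytic: {exact:.4f}  empirical (width 20000): {approx:.4f}")
```

As the width grows, the empirical estimate concentrates around the closed-form value, which is the sense in which an infinitely wide network can be summarized by a fixed kernel.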
Taxonomy
Topics: Machine Learning and ELM · Neural Networks and Applications · Stochastic Gradient Optimization Techniques
