A Revision of Neural Tangent Kernel-based Approaches for Neural Networks

Kyung-Su Kim; Aurélie C. Lozano; Eunho Yang

arXiv:2007.00884 · cs.LG · August 10, 2020

TL;DR

This paper revisits neural tangent kernel (NTK)-based results for neural networks. It shows that existing bounds lose validity when the network scaling factor decreases with sample size, and it proposes tighter bounds that remove this dependence and restore the earlier conclusions about training and generalization.

Contribution

It identifies and corrects errors in NTK-based theoretical results related to network scaling, providing tighter bounds and validation of previous findings.

Findings

01. NTK-based bounds are invalid when the scaling factor decreases with sample size

02. Tighter bounds remove the influence of the scaling factor, restoring validity

03. Derived a kernel that outperforms existing methods in few-shot learning

Abstract

Recent theoretical works based on the neural tangent kernel (NTK) have shed light on the optimization and generalization of over-parameterized networks and have partially bridged the gap between their practical success and classical learning theory. In particular, the NTK-based approach yielded the following three representative results: (1) A training error bound was derived to show that networks can fit any finite training sample perfectly, reflecting a tighter characterization of training speed that depends on the data complexity. (2) A generalization error bound independent of network size was derived by using a data-dependent complexity measure (CMD). It follows from this CMD bound that networks can generalize arbitrary smooth functions. (3) A simple and analytic kernel function was derived and shown to be equivalent to a fully-trained network. This kernel outperforms its corresponding…
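Result (2) in the abstract refers to a data-dependent complexity measure. As a rough illustration only, the sketch below computes one commonly cited form of such a measure, sqrt(2 yᵀ H⁻¹ y / n), where H is the infinite-width NTK Gram matrix of a two-layer ReLU network with unit-norm inputs. This page does not give the formula, so the choice of kernel, the exact form of the measure, and the names `relu_ntk_gram`, `complexity_measure`, and `jitter` are assumptions made here for illustration; they are not taken from the paper.

```python
import numpy as np

def relu_ntk_gram(X):
    """Assumed kernel: infinite-width NTK Gram matrix of a two-layer ReLU
    network (first layer trained), evaluated on row-normalized inputs."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm rows
    G = np.clip(X @ X.T, -1.0, 1.0)                    # cosines, clipped for arccos
    return G * (np.pi - np.arccos(G)) / (2.0 * np.pi)

def complexity_measure(X, y, jitter=1e-10):
    """Assumed measure: sqrt(2 * y^T H^{-1} y / n).
    `jitter` is a small diagonal term added only for numerical stability."""
    n = len(y)
    H = relu_ntk_gram(X) + jitter * np.eye(n)
    return float(np.sqrt(2.0 * y @ np.linalg.solve(H, y) / n))

# Toy data (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = np.sign(X[:, 0])

H = relu_ntk_gram(X)
cmd = complexity_measure(X, y)
print("complexity measure:", cmd)
```

Note that this quantity depends only on the inputs and labels, not on the network width, which is the sense in which a bound built on it can be independent of network size.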

Taxonomy

Topics: Machine Learning and ELM · Neural Networks and Applications · Stochastic Gradient Optimization Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings