Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks

Blake Bordelon; Abdulkadir Canatar; Cengiz Pehlevan

arXiv:2002.02561 · cs.LG · February 26, 2021 · 29 cites

TL;DR

This paper provides analytical formulas for the generalization performance of kernel regression and wide neural networks, revealing a spectral principle where models learn successively higher spectral modes as training data increases.

Contribution

It introduces a spectral decomposition approach to understanding learning dynamics in kernel methods and neural networks, linking spectral modes to training set size and data distribution.

Findings

01 Kernel methods and neural networks learn spectral modes sequentially.

02 Learning stages depend on data distribution and kernel spectral properties.

03 Theoretical predictions are validated with simulations on synthetic and real data.

Abstract

We derive analytical expressions for the generalization performance of kernel regression as a function of the number of training samples using theoretical methods from Gaussian processes and statistical physics. Our expressions apply to wide neural networks due to an equivalence between training them and kernel regression with the Neural Tangent Kernel (NTK). By computing the decomposition of the total generalization error due to different spectral components of the kernel, we identify a new spectral principle: as the size of the training set grows, kernel machines and neural networks fit successively higher spectral modes of the target function. When data are sampled from a uniform distribution on a high-dimensional hypersphere, dot product kernels, including NTK, exhibit learning stages where different frequency modes of the target function are learned. We verify our theory with…
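The spectral principle described in the abstract can be illustrated with a small numerical experiment. The sketch below is not the paper's analytical formula and is not taken from the authors' repository; it is a minimal NumPy simulation under assumptions chosen for this illustration: inputs uniform on the unit circle, the dot product kernel exp(x·x' − 1), a target made of two cosine modes (k = 1 and k = 4), and a ridge of 0.1. The names `kernel`, `target`, and `mode_errors` are hypothetical helpers. For this kernel the Fourier modes are the eigenfunctions and the eigenvalues shrink with frequency, so the low-frequency mode should be fit with far fewer samples than the high-frequency one.

```python
import numpy as np


def kernel(a, b):
    """Dot product kernel on the unit circle (assumed for illustration).

    For points at angles a and b, x . x' = cos(a - b), so K = exp(cos(a - b) - 1).
    """
    return np.exp(np.cos(a[:, None] - b[None, :]) - 1.0)


def target(theta, modes=(1, 4)):
    """Target function: a sum of unit-amplitude cosine modes."""
    return sum(np.cos(k * theta) for k in modes)


def mode_errors(n_train, modes=(1, 4), ridge=0.1, trials=10, seed=0):
    """Average per-mode test error of kernel ridge regression.

    Returns one value per entry of `modes`: the power of the residual
    (prediction minus target) at that frequency, averaged over random
    training sets of size n_train.
    """
    rng = np.random.default_rng(seed)
    # Dense uniform grid used to project the residual onto Fourier modes.
    grid = np.linspace(0.0, 2.0 * np.pi, 2048, endpoint=False)
    errs = np.zeros(len(modes))
    for _ in range(trials):
        x = rng.uniform(0.0, 2.0 * np.pi, n_train)
        # Kernel ridge regression: alpha = (K + ridge * I)^{-1} y.
        gram = kernel(x, x) + ridge * np.eye(n_train)
        alpha = np.linalg.solve(gram, target(x, modes))
        resid = kernel(grid, x) @ alpha - target(grid, modes)
        for i, k in enumerate(modes):
            a = 2.0 * np.mean(resid * np.cos(k * grid))  # cosine coefficient
            b = 2.0 * np.mean(resid * np.sin(k * grid))  # sine coefficient
            errs[i] += 0.5 * (a * a + b * b)  # contribution to mean squared error
    return errs / trials


if __name__ == "__main__":
    for p in (4, 16, 64, 256):
        e1, e4 = mode_errors(p)
        print(f"p={p:4d}  mode k=1 error={e1:.4f}  mode k=4 error={e4:.4f}")
```

Running the script prints the two mode errors for increasing training set size `p`; under these assumptions the k = 1 error should fall at much smaller `p` than the k = 4 error. Changing the kernel or the ridge shifts where each mode is learned, which is the spectrum dependence the paper analyzes.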

Code & Models

Repositories

Pehlevan-Group/NTK_Learning_Curves (JAX, official)

Videos

Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks · SlidesLive

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Neural Networks and Applications · Machine Learning and ELM

Methods: Neural Tangent Kernel