Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks

Abdulkadir Canatar; Blake Bordelon; Cengiz Pehlevan

arXiv:2006.13198 · stat.ML · February 8, 2022

TL;DR

This paper develops an analytical theory for understanding how kernel regression and infinitely wide neural networks generalize, highlighting the role of task-model alignment and spectral properties in predicting performance.

Contribution

It introduces a new analytical expression for generalization error applicable to various kernels and data distributions, connecting spectral properties to generalization in neural networks.

Findings

01

Kernel eigenfunctions reveal data simplicity and task compatibility.

02

More data can impair generalization with noisy or incompatible kernels.

03

Rotation-invariant kernels exhibit non-monotonic learning curves in high dimensions.
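The first finding, that kernel eigenfunctions indicate how compatible a task is with a kernel, can be illustrated with a rough empirical proxy. The sketch below is not taken from the paper or its official repository. It assumes an RBF kernel with a hypothetical length scale of 0.5, one-dimensional inputs on a uniform grid, and two toy targets. It eigendecomposes the Gram matrix, projects each target onto the eigenvectors ordered by decreasing eigenvalue, and reports how many leading modes are needed to capture 90% of the target's squared norm. A target that needs few leading modes is, in this informal sense, better aligned with the kernel.

```python
import numpy as np


def rbf_kernel(x, y, length_scale=0.5):
    """Gaussian (RBF) kernel between two 1-D arrays of inputs.

    The kernel choice and length scale are illustrative assumptions.
    """
    sq_dist = (x[:, None] - y[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))


def cumulative_power(x, y, length_scale=0.5):
    """Fraction of the target's squared norm captured by the top-k
    eigenvectors of the Gram matrix, for k = 1..len(x)."""
    gram = rbf_kernel(x, x, length_scale)
    eigvals, eigvecs = np.linalg.eigh(gram)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # reorder: largest eigenvalue first
    coeffs = eigvecs[:, order].T @ y             # target coefficients in the eigenbasis
    power = coeffs ** 2
    return np.cumsum(power) / np.sum(power)


def modes_needed(cum, threshold=0.9):
    """Smallest number of leading modes whose cumulative power reaches threshold."""
    return int(np.searchsorted(cum, threshold) + 1)


x = np.linspace(-1.0, 1.0, 200)
cum_smooth = cumulative_power(x, np.sin(x))          # slowly varying target
cum_rough = cumulative_power(x, np.sin(20.0 * x))    # rapidly oscillating target

print("modes for 90% power, smooth target:", modes_needed(cum_smooth))
print("modes for 90% power, rough target: ", modes_needed(cum_rough))
```

Under these assumptions the slowly varying target concentrates its power in fewer leading modes than the oscillating one, which is the qualitative picture behind the alignment finding; the paper's own definitions are stated with respect to the data distribution rather than a finite grid.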

Abstract

Generalization beyond a training dataset is a main goal of machine learning, but theoretical understanding of generalization remains an open problem for many models. The need for a new theory is exacerbated by recent observations in deep neural networks where overparameterization leads to better performance, contradicting the conventional wisdom from classical statistics. In this paper, we investigate generalization error for kernel regression, which, besides being a popular machine learning method, also includes infinitely overparameterized neural networks trained with gradient descent. We use techniques from statistical mechanics to derive an analytical expression for generalization error applicable to any kernel or data distribution. We present applications of our theory to real and synthetic datasets, and for many kernels including those that arise from training deep neural networks…
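To make the quantity under study concrete, here is a minimal sketch of kernel (ridge) regression and an empirically estimated learning curve, that is, test error as a function of training set size. It is an illustration under stated assumptions and not the authors' analytical theory or code: the RBF kernel, the length scale, the ridge value `1e-3`, the toy target `sin(3x)`, and the helper names are all hypothetical choices made for this example.

```python
import numpy as np


def rbf_kernel(x, y, length_scale=0.5):
    """Gaussian (RBF) kernel between two 1-D arrays (illustrative choice)."""
    sq_dist = (x[:, None] - y[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))


def kernel_regression_predict(x_train, y_train, x_test, ridge=1e-3):
    """Kernel ridge regression: solve (K + ridge * I) alpha = y, then predict."""
    gram = rbf_kernel(x_train, x_train)
    alpha = np.linalg.solve(gram + ridge * np.eye(len(x_train)), y_train)
    return rbf_kernel(x_test, x_train) @ alpha


def learning_curve(target, sample_sizes, noise_std=0.0, n_trials=20, seed=0):
    """Mean squared test error for each training set size, averaged over
    randomly drawn training sets."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(-1.0, 1.0, 500)
    y_test = target(x_test)
    errors = []
    for p in sample_sizes:
        trial_errors = []
        for _ in range(n_trials):
            x_train = rng.uniform(-1.0, 1.0, size=p)
            # Optional label noise; noise_std = 0 gives the noiseless setting.
            y_train = target(x_train) + noise_std * rng.normal(size=p)
            pred = kernel_regression_predict(x_train, y_train, x_test)
            trial_errors.append(np.mean((pred - y_test) ** 2))
        errors.append(float(np.mean(trial_errors)))
    return errors


sample_sizes = [5, 20, 80]
errors = learning_curve(lambda x: np.sin(3.0 * x), sample_sizes)
for p, err in zip(sample_sizes, errors):
    print(f"P = {p:3d}  mean test MSE = {err:.3e}")
```

Setting `noise_std` above zero or swapping in a different kernel or target lets one probe empirically how the curve changes; the paper's contribution is an analytical expression for this kind of curve, which this simulation does not reproduce.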

Code & Models

Repositories

Pehlevan-Group/kernel-generalization
Official

Taxonomy

Methods: Linear Regression