Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime

Benjamin Bowman; Guido Montufar

arXiv:2206.02927 · stat.ML · October 18, 2022 · 1 citation

TL;DR

This paper quantifies how finite-width deep neural networks trained on finitely many samples tend to learn the top eigenfunctions of the Neural Tangent Kernel over the entire input space, not only on the training set, revealing an inherent spectral bias.

Contribution

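It provides quantitative $L^{2}$ bounds on the difference between the finite-width, finite-sample network trajectory and the idealized infinite-width, infinite-data kernel dynamics, which imply a bias towards the top eigenfunctions that is independent of the target function.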

Findings

01 Bias depends only on the architecture and the input distribution

02 Width does not need to grow polynomially with the number of samples

03 Results apply to deep architectures with fully connected, convolutional, and residual layers

Abstract

We provide quantitative bounds measuring the $L^{2}$ difference in function space between the trajectory of a finite-width network trained on finitely many samples and the idealized kernel dynamics of infinite width and infinite data. An implication of the bounds is that the network is biased to learn the top eigenfunctions of the Neural Tangent Kernel not just on the training set but over the entire input space. This bias depends on the model architecture and input distribution alone and thus does not depend on the target function, which does not need to be in the RKHS of the kernel. The result is valid for deep architectures with fully connected, convolutional, and residual layers. Furthermore, the width does not need to grow polynomially with the number of samples in order to obtain high-probability bounds up to a stopping time. The proof exploits the low-effective-rank property of the…
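The spectral bias described in the abstract can be illustrated with a small numerical sketch. It is not the paper's construction: it assumes plain kernel gradient flow on a training set, where the residual along each eigenvector of the kernel matrix decays as exp(-λt), and it uses an RBF kernel as a stand-in for the Neural Tangent Kernel. The function name `residual_decay`, the lengthscale, the 1/n normalization, and the time `t` are all hypothetical choices made for illustration.

```python
import numpy as np

def residual_decay(K, t):
    """Fraction of the residual remaining along each eigenvector of K
    after time t of kernel gradient flow (residual_t = exp(-K t) residual_0).

    Assumption: unit learning rate and a symmetric PSD kernel matrix K.
    """
    eigvals, eigvecs = np.linalg.eigh(K)  # eigenvalues in ascending order
    remaining = np.exp(-eigvals * t)      # per-direction decay factor
    return eigvals, eigvecs, remaining

# Hypothetical setup: 50 one-dimensional inputs and an RBF kernel
# standing in for the NTK (assumption, not the paper's kernel).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 1))
sq_dists = (X - X.T) ** 2                          # pairwise squared distances
K = np.exp(-sq_dists / (2 * 0.5**2)) / len(X)      # normalized by sample count

eigvals, eigvecs, remaining = residual_decay(K, t=10.0)

# Evolve an arbitrary initial residual in the eigenbasis.
residual0 = rng.normal(size=len(X))
coeffs0 = eigvecs.T @ residual0
coeffs_t = remaining * coeffs0

print("top eigenvalue:", eigvals[-1], "residual fraction left:", remaining[-1])
print("smallest eigenvalue:", eigvals[0], "residual fraction left:", remaining[0])
```

Directions with large eigenvalues are fit almost immediately, while directions with small eigenvalues remain nearly unchanged at the same time `t`. The paper's contribution, per the abstract, is to bound how far a finite-width, finite-sample network deviates from the analogous idealized dynamics over the entire input space, which this training-set sketch does not attempt to show.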

Code & Models

Repositories

bbowman223/deepspec (PyTorch, official)

Videos

Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Machine Learning and ELM