Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks
Benjamin Bowman, Guido Montufar

TL;DR
This paper analyzes underparameterized neural networks trained on mean squared error via gradient flow. It shows a spectral bias towards the eigenfunctions of an integral operator determined by the Neural Tangent Kernel (NTK), and introduces the concept of damped deviations to unify the analysis of network dynamics.
Contribution
It introduces the concept of damped deviations to analyze neural network training dynamics and characterizes the spectral bias in underparameterized regimes.
Findings
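In the underparameterized regime, the network learns the eigenfunctions of an integral operator determined by the NTK at rates corresponding to their eigenvalues.
For uniformly distributed data on the sphere and rotation-invariant weight distributions, these eigenfunctions are the spherical harmonics.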
Damped deviations help unify analysis of both under- and overparameterized regimes.
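The first two findings can be illustrated with a toy computation. The sketch below is not the paper's setup: it assumes a fixed kernel (no change during training), uses a simple rotation-invariant kernel on the circle as a stand-in for the NTK, and replaces the integral operator by the Gram matrix divided by the sample size. The names `kernel_flow_residual` and `mode_amplitude` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def kernel_flow_residual(K, r0, t):
    """Solve dr/dt = -(K / n) r for a fixed symmetric PSD Gram matrix K.

    Each eigen-component of the residual decays as exp(-lambda_i * t),
    where lambda_i are the eigenvalues of K / n.
    """
    n = K.shape[0]
    lam, U = np.linalg.eigh(K / n)      # eigen-decomposition of the empirical operator
    coeffs = U.T @ r0                   # residual in the eigenbasis
    return U @ (np.exp(-lam * t) * coeffs)

def mode_amplitude(r, k, theta):
    """Amplitude of the cos(k * theta) component of r on a uniform grid."""
    return 2.0 * np.mean(r * np.cos(k * theta))

# Uniform grid on the unit circle (the one-dimensional "sphere").
n = 64
theta = 2.0 * np.pi * np.arange(n) / n
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Toy rotation-invariant kernel (an assumption, NOT the NTK): depends only on <x, x'>.
K = np.exp(X @ X.T - 1.0)

# Initial residual with one low-frequency and one high-frequency component.
r0 = np.cos(theta) + np.cos(8 * theta)

t = 20.0
r_t = kernel_flow_residual(K, r0, t)

low = mode_amplitude(r_t, 1, theta)     # remaining amplitude at frequency 1
high = mode_amplitude(r_t, 8, theta)    # remaining amplitude at frequency 8
print(f"frequency 1 remaining: {low:.4f}")
print(f"frequency 8 remaining: {high:.4f}")
```

For this kernel the Fourier modes are eigenvectors of the Gram matrix, and the low-frequency mode has a much larger eigenvalue than the high-frequency one, so its share of the residual shrinks far faster. That is the sense of "rates corresponding to eigenvalues" in this simplified setting.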
Abstract
We study the dynamics of a neural network in function space when optimizing the mean squared error via gradient flow. We show that in the underparameterized regime the network learns eigenfunctions of an integral operator determined by the Neural Tangent Kernel (NTK) at rates corresponding to their eigenvalues. For example, for uniformly distributed data on the sphere and rotation-invariant weight distributions, the eigenfunctions of this operator are the spherical harmonics. Our results can be understood as describing a spectral bias in the underparameterized regime. The proofs use the concept of "Damped Deviations", where deviations of the NTK matter less for eigendirections with large eigenvalues due to the occurrence of a damping factor. Aside from the underparameterized regime, the damped deviations point-of-view can be used to track the dynamics of the…
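One way to see where a damping factor can come from is the following sketch. It is a generic variation-of-parameters argument, not a statement of the paper's lemma, and the notation is assumed for illustration: r_t is the residual at time t, T_s is a time-dependent operator driving the dynamics, T is a fixed self-adjoint reference operator, and (σ_i, φ_i) is an eigenpair of T.

```latex
% Assumed dynamics: the residual follows a time-dependent linear flow.
\partial_t r_t = -T_t r_t
              = -T r_t + (T - T_t)\, r_t .

% Variation of parameters around the fixed operator T:
r_t = e^{-T t} r_0 + \int_0^t e^{-T (t - s)} (T - T_s)\, r_s \, ds .

% Projecting onto an eigenfunction \phi_i of T with eigenvalue \sigma_i:
\langle r_t, \phi_i \rangle
  = e^{-\sigma_i t} \langle r_0, \phi_i \rangle
  + \int_0^t e^{-\sigma_i (t - s)}
      \langle (T - T_s)\, r_s, \phi_i \rangle \, ds .
```

In the last line the deviation term T - T_s is multiplied by e^{-σ_i (t - s)}, so for a larger eigenvalue σ_i the contribution of past deviations is suppressed more strongly.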
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Advanced Neuroimaging Techniques and Applications
Methods: Neural Tangent Kernel
