Frequency Bias in Neural Networks for Input of Non-Uniform Density
Ronen Basri, Meirav Galun, Amnon Geifman, David Jacobs, Yoni Kasten, Shira Kritchman

TL;DR
This paper investigates how neural networks trained on non-uniform data distributions exhibit frequency bias, affecting convergence rates depending on local data density, using NTK analysis for both shallow and deep networks.
Contribution
It analytically derives convergence times related to local data density and frequency, extending NTK analysis to non-uniform distributions and deep networks.
Findings
Convergence time depends on local density p(x) and frequency κ as O(κ^d/p(x)).
Eigenfunctions of NTK are derived for two-layer networks on the circle.
Deep networks exhibit convergence behavior that is similar to, but not identical to, that of shallow networks.
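The scaling in the first finding can be made concrete with a small calculation. The sketch below is illustrative only: it sets the constant hidden by the O(·) notation to 1, assumes that d = 2 corresponds to data on the circle, and uses a hypothetical piecewise-constant density (one half of the circle three times denser than the other). The function name and the density values are not from the paper.

```python
import math


def relative_convergence_time(kappa: float, density: float, d: int = 2) -> float:
    """Return kappa**d / density, the stated scaling with the hidden constant set to 1.

    Only ratios between returned values are meaningful; the absolute numbers
    are not training times.
    """
    if kappa <= 0 or density <= 0:
        raise ValueError("kappa and density must be positive")
    return kappa ** d / density


# Hypothetical density on the circle: each value holds on an arc of length pi,
# so the total probability is pi * P_DENSE + pi * P_SPARSE = 1.
P_DENSE = 1.5 / (2 * math.pi)
P_SPARSE = 0.5 / (2 * math.pi)

for kappa in (2, 4, 8):
    t_dense = relative_convergence_time(kappa, P_DENSE)
    t_sparse = relative_convergence_time(kappa, P_SPARSE)
    print(f"kappa={kappa}: dense={t_dense:.1f}, sparse={t_sparse:.1f}, "
          f"sparse/dense={t_sparse / t_dense:.2f}")
```

Under these assumptions, the sparse half is predicted to converge three times more slowly than the dense half at every frequency, and doubling κ multiplies the predicted time by 2^d = 4 in both regions.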
Abstract
Recent works have partly attributed the generalization ability of over-parameterized neural networks to frequency bias -- networks trained with gradient descent on data drawn from a uniform distribution find a low frequency fit before high frequency ones. As realistic training sets are not drawn from a uniform distribution, we here use the Neural Tangent Kernel (NTK) model to explore the effect of variable density on training dynamics. Our results, which combine analytic and empirical observations, show that when learning a pure harmonic function of frequency κ, convergence at a point x occurs in time O(κ^d/p(x)), where p(x) denotes the local density at x. Specifically, for data in S^1 (the circle) we analytically derive the eigenfunctions of the kernel associated with the NTK for two-layer networks. We further prove convergence results for deep,…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Model Reduction and Neural Networks
Methods: Neural Tangent Kernel
