Understanding Layer-wise Contributions in Deep Neural Networks through Spectral Analysis
Yatin Dandi, Arthur Jacot

TL;DR
This paper investigates how different layers of a deep neural network contribute to learning functions of varying frequency. It shows that initial layers exhibit a larger bias towards high-frequency functions on the unit sphere, and validates this empirically on high-dimensional datasets.
Contribution
The paper introduces a layer-wise spectral analysis of DNNs that relates each layer's spectral bias to its contribution to reducing generalization error for a given target function, supported by theoretical proofs and empirical results.
Findings
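Initial layers exhibit a larger bias towards high-frequency functions defined on the unit sphere.
Spectral bias differs across layers, which affects how much each layer contributes to reducing generalization error for a given target function.
Experiments on high-dimensional datasets support the theoretical predictions.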
Abstract
Spectral analysis is a powerful tool, decomposing any function into simpler parts. In machine learning, Mercer's theorem generalizes this idea, providing for any kernel and input distribution a natural basis of functions of increasing frequency. More recently, several works have extended this analysis to deep neural networks through the framework of Neural Tangent Kernel. In this work, we analyze the layer-wise spectral bias of Deep Neural Networks and relate it to the contributions of different layers in the reduction of generalization error for a given target function. We utilize the properties of Hermite polynomials and Spherical Harmonics to prove that initial layers exhibit a larger bias towards high-frequency functions defined on the unit sphere. We further provide empirical results validating our theory in high dimensional datasets for Deep Neural Networks.
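To make the Mercer decomposition mentioned in the abstract concrete, here is a minimal sketch of a kernel spectrum indexed by frequency. It is a toy illustration, not the paper's layer-wise analysis. It assumes inputs on the unit circle with a uniform distribution, where the eigenfunctions of any dot-product kernel are Fourier modes, and it uses the first-order arc-cosine kernel (the kernel associated with an infinitely wide ReLU layer) as a stand-in. The function names `arccos_kernel` and `circle_spectrum` are hypothetical and chosen for this example only.

```python
import numpy as np

def arccos_kernel(u):
    """First-order arc-cosine kernel for unit-norm inputs, as a function of u = <x, y>."""
    u = np.clip(u, -1.0, 1.0)  # guard against rounding slightly outside [-1, 1]
    theta = np.arccos(u)
    return (np.sin(theta) + (np.pi - theta) * u) / np.pi

def circle_spectrum(kernel, n=256):
    """Approximate Mercer eigenvalues of a dot-product kernel on the unit circle.

    Assumption: inputs are n equally spaced points under the uniform measure.
    Entry k of the result is the eigenvalue of the Fourier mode of frequency k.
    """
    angles = 2 * np.pi * np.arange(n) / n
    # k(x_0, x_j) depends only on the angle between the two points, so the
    # Gram matrix is circulant and this row determines it completely.
    first_row = kernel(np.cos(angles))
    # Eigenvalues of a symmetric circulant matrix are the DFT of its first row;
    # dividing by n converts them to eigenvalues under the uniform measure.
    eigvals = np.fft.fft(first_row).real / n
    return eigvals[: n // 2]

spectrum = circle_spectrum(arccos_kernel)
for k in (0, 1, 2, 4, 8):
    print(f"frequency {k}: eigenvalue {spectrum[k]:.3e}")
```

For this particular kernel the eigenvalues at even frequencies shrink as the frequency grows, and odd frequencies above 1 are numerically zero. A smaller eigenvalue means the corresponding frequency is learned more slowly by kernel regression, which is the sense of "spectral bias" used in the abstract.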
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Model Reduction and Neural Networks
