On the exact computation of linear frequency principle dynamics and its generalization
Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

TL;DR
This paper derives an exact differential equation, posed in the frequency domain, for how an infinite-width two-layer neural network fits different frequency components of the target function during training, and uses it to explain both the training dynamics and a generalization bound.
Contribution
The paper introduces the Linear Frequency-Principle (LFP) model for infinite-width two-layer neural networks in the neural tangent kernel (NTK) regime, giving an exact description of frequency dynamics during training.
Findings
Higher frequencies evolve polynomially or exponentially slower than lower frequencies, depending on the smoothness of the activation function.
The LFP model minimizes a Frequency-Principle norm (FP-norm) that penalizes higher frequencies.
Generalization error is bounded by the FP-norm of the target function.
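The findings above can be illustrated with a minimal sketch. It is a toy model, not the paper's exact LFP equation: it assumes each frequency mode decays independently at a polynomial rate freq ** (-decay_power), and it uses the matching frequency-weighted l2 norm as a stand-in for the FP-norm. The function names, the decay_power parameter, and its default value are hypothetical choices made for illustration only.

```python
import numpy as np

def toy_mode_residuals(freqs, t, decay_power=4.0):
    """Residual amplitude per frequency at time t, starting from amplitude 1.

    Toy assumption (not the paper's exact equation): each mode decays
    independently at rate freq ** (-decay_power).
    """
    freqs = np.asarray(freqs, dtype=float)
    rates = freqs ** (-decay_power)
    return np.exp(-rates * t)

def toy_fp_norm(coeffs, freqs, decay_power=4.0):
    """Weighted l2 norm sqrt(sum |c_k|^2 * freq_k ** decay_power).

    Illustrative stand-in for an FP-norm: the weight grows with frequency.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    return float(np.sqrt(np.sum(np.abs(coeffs) ** 2 * freqs ** decay_power)))

freqs = np.array([1.0, 2.0, 4.0, 8.0])

# Residuals shrink fastest at low frequency and slowest at high frequency.
for t in (1.0, 10.0, 100.0):
    print(t, np.round(toy_mode_residuals(freqs, t), 4))

# Same amplitude, different frequency: the high-frequency function costs more.
print(toy_fp_norm([1, 0, 0, 0], freqs))
print(toy_fp_norm([0, 0, 0, 1], freqs))
```

In this sketch the slowly decaying modes are exactly the ones the norm weights most heavily, which mirrors the link between slow high-frequency evolution and a norm that penalizes high frequencies. The real LFP dynamics act on the residual at the training points, which couples the frequencies; the toy model ignores that coupling.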
Abstract
Recent works show an intriguing phenomenon of Frequency Principle (F-Principle) that deep neural networks (DNNs) fit the target function from low to high frequency during the training, which provides insight into the training and generalization behavior of DNNs in complex tasks. In this paper, through analysis of an infinite-width two-layer NN in the neural tangent kernel (NTK) regime, we derive the exact differential equation, namely Linear Frequency-Principle (LFP) model, governing the evolution of NN output function in the frequency domain during the training. Our exact computation applies for general activation functions with no assumption on size and distribution of training data. This LFP model unravels that higher frequencies evolve polynomially or exponentially slower than lower frequencies depending on the smoothness/regularity of the activation function. We further bridge the…
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and ELM
