Theory of the Frequency Principle for General Deep Neural Networks
Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

TL;DR
This paper gives a rigorous theoretical foundation for the Frequency Principle (F-Principle) in deep neural networks: across the initial, intermediate, and final stages of training, a network learns a target function from low to high frequencies. The results cover multilayer networks with general activation functions and a large class of loss functions.
Contribution
It offers the first comprehensive theoretical analysis of the F-Principle across all training stages for general DNNs, with one theorem per stage.
Findings
The F-Principle holds at initial, intermediate, and final training stages.
Theoretical results apply to multilayer networks with general activation functions.
The results provide a mathematical basis for understanding DNN training dynamics.
Abstract
Alongside the fruitful applications of Deep Neural Networks (DNNs) to realistic problems, recent empirical studies of DNNs have reported a universal phenomenon, the Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during training. The F-Principle has been very useful in providing both qualitative and quantitative understanding of DNNs. In this paper, we rigorously investigate the F-Principle for the training dynamics of a general DNN at three stages: the initial stage, the intermediate stage, and the final stage. For each stage, a theorem is provided in terms of proper quantities characterizing the F-Principle. Our results are general in the sense that they work for multilayer networks with general activation functions, population densities of data, and a large class of loss functions. Our work lays a theoretical foundation of the F-Principle…
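To make "low to high frequencies" concrete, the sketch below computes a per-frequency relative error between a target signal and a model output using the discrete Fourier transform. This is a common empirical diagnostic and is an assumption here, not necessarily the quantity used in the paper's theorems. The function name relative_frequency_error, the toy target, and the hand-written "snapshot" standing in for a partly trained network are all hypothetical.

```python
import numpy as np

def relative_frequency_error(target, prediction, freqs):
    """Relative error of DFT coefficients at the chosen integer frequencies.

    Both signals are assumed to be sampled on a uniform grid over one full period.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    t_hat = np.fft.rfft(target)
    p_hat = np.fft.rfft(prediction)
    return {k: abs(t_hat[k] - p_hat[k]) / abs(t_hat[k]) for k in freqs}

# Uniform grid over one period [0, 2*pi); the endpoint is excluded so that
# integer frequencies fall exactly on DFT bins.
x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)

# Toy target with one low-frequency and one high-frequency component.
target = np.sin(x) + 0.5 * np.sin(5.0 * x)

# Assumed stand-in for a partly trained model: the low frequency is fitted,
# the high frequency has not been learned yet.
snapshot = np.sin(x)

errors = relative_frequency_error(target, snapshot, freqs=[1, 5])
print(errors)
```

In this constructed snapshot the error at frequency 1 is close to zero while the error at frequency 5 is close to one. Tracking such errors over the course of training is one way the F-Principle is observed empirically: the low-frequency error would be expected to fall before the high-frequency error does.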
Taxonomy
Topics: Neural Networks and Applications · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
