Deep frequency principle towards understanding why deeper learning is faster

Zhi-Qin John Xu; Hanxu Zhou

arXiv:2007.14313 · cs.LG · December 22, 2020 · 5 cites

TL;DR

This paper uses Fourier analysis to show empirically that the effective target function for a deeper hidden layer biases towards lower frequency during training, which offers a mechanism for why deeper networks often train faster.

Contribution

It proposes the deep frequency principle: the effective target function for a deeper hidden layer biases towards lower frequency during training. This gives an empirical explanation for faster training in deeper networks.

Findings

01 The effective target function for a deeper hidden layer biases towards lower frequency during training.

02 Deeper networks learn lower frequency components faster.

03 Experiments on deep networks and real data support the deep frequency principle.

Abstract

Understanding the effect of depth in deep learning is a critical problem. In this work, we use Fourier analysis to empirically provide a promising mechanism for understanding why deeper feedforward learning is faster. To this end, we separate a deep neural network, trained by normal stochastic gradient descent, into two parts during analysis, i.e., a pre-condition component and a learning component, in which the output of the pre-condition component is the input of the learning component. We use a filtering method to characterize the frequency distribution of a high-dimensional function. Based on experiments with deep networks and real datasets, we propose a deep frequency principle, that is, the effective target function for a deeper hidden layer biases towards lower frequency during training. Therefore, the learning component effectively learns a lower frequency function if the pre-condition…
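The abstract mentions a filtering method for characterizing the frequency content of a high-dimensional function but, as quoted here, does not spell it out. The sketch below is one plausible reading, not necessarily the authors' exact procedure: it smooths the outputs with a Gaussian kernel over the inputs and reports the fraction of energy that survives. The function name `low_frequency_ratio` and the width parameter `delta` are hypothetical names introduced for illustration.

```python
import numpy as np


def low_frequency_ratio(x, y, delta):
    """Fraction of the energy of y that survives Gaussian smoothing over x.

    x: (n, d) array of inputs, y: (n, k) array of outputs.
    delta: kernel width (hypothetical parameter); larger values keep
    only lower frequencies.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise squared Euclidean distances between all samples.
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    # Gaussian weights, normalized so each row sums to one.
    kernel = np.exp(-sq_dist / (2.0 * delta ** 2))
    kernel /= kernel.sum(axis=1, keepdims=True)
    # Low-pass filtered outputs: a weighted average over nearby samples.
    y_low = kernel @ y
    return float(np.sum(y_low ** 2) / np.sum(y ** 2))


# Toy check on 1-D inputs: a slow sine keeps most of its energy,
# a fast sine loses almost all of it.
x = np.linspace(0.0, 1.0, 200)[:, None]
ratio_slow = low_frequency_ratio(x, np.sin(2 * np.pi * x), delta=0.05)
ratio_fast = low_frequency_ratio(x, np.sin(40 * np.pi * x), delta=0.05)
print(ratio_slow, ratio_fast)
```

To compare layers in the spirit of the abstract, one would pass the output of the pre-condition component (the hidden activations at a chosen layer) as `x` and the training labels as `y`, assuming the effective target function is read as the mapping from that hidden output to the labels. A higher ratio at deeper layers would then be consistent with the stated principle.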

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Adversarial Robustness in Machine Learning