Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks

Yaoyu Zhang; Zhi-Qin John Xu; Tao Luo; Zheng Ma

arXiv:1905.10264 · cs.LG · May 27, 2019 · 31 cites

TL;DR

This paper makes the implicit bias behind the Frequency Principle explicit for wide two-layer ReLU networks: training behaves like minimizing a norm that penalizes high-frequency components, which helps explain when such networks generalize well.

Contribution

The paper introduces a linear F-Principle (LFP) dynamics model for wide two-layer ReLU networks and shows that its long-time limit solves a constrained optimization problem minimizing an FP-norm, in which higher frequencies are penalized more heavily.

Findings

01

The LFP dynamics accurately predicts the learning results of two-layer ReLU networks of large width.

02

The long-time limit of the LFP dynamics is equivalent to solving a constrained optimization problem that minimizes an FP-norm, in which higher frequencies are penalized more heavily.

03

A higher FP-norm of the target function increases the generalization error.

Abstract

It remains a puzzle why deep neural networks (DNNs), with more parameters than samples, often generalize well. One attempt to understand this puzzle is to discover implicit biases underlying the training process of DNNs, such as the Frequency Principle (F-Principle), i.e., DNNs often fit target functions from low to high frequencies. Inspired by the F-Principle, we propose an effective model of linear F-Principle (LFP) dynamics which accurately predicts the learning results of two-layer ReLU neural networks (NNs) of large widths. This LFP dynamics is rationalized by a linearized mean field residual dynamics of NNs. Importantly, the long-time limit solution of this LFP dynamics is equivalent to the solution of a constrained optimization problem explicitly minimizing an FP-norm, in which higher frequencies of feasible solutions are more heavily penalized. Using this optimization…
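
The constrained problem described in the abstract can be illustrated with a minimal sketch; this is not the authors' implementation. It assumes a 1D periodic setting, a truncated Fourier basis, and a hypothetical penalty weight `(1 + k^2)^decay`, since the paper's actual FP-norm weight is not given in this excerpt. The names `fourier_features` and `min_weighted_norm_fit` are invented for the example. Among all coefficient vectors that interpolate the training samples, the sketch picks the one with the smallest frequency-weighted norm, using the closed form for an equality-constrained quadratic problem.

```python
import numpy as np

def fourier_features(x, max_freq):
    """Columns: cos(2*pi*k*x) for k = 0..max_freq, then sin(2*pi*k*x) for k = 1..max_freq."""
    x = np.asarray(x, dtype=float)[:, None]
    k = np.arange(max_freq + 1)
    feats = np.hstack([np.cos(2 * np.pi * k * x), np.sin(2 * np.pi * k[1:] * x)])
    freqs = np.concatenate([k, k[1:]])  # frequency index of each column
    return feats, freqs

def min_weighted_norm_fit(x_train, y_train, max_freq=20, decay=2.0):
    """Minimize sum_k w_k * c_k**2 subject to phi @ c = y_train.

    ASSUMPTION: w_k = (1 + k**2) ** decay is a stand-in weight that grows with
    frequency; it is not the FP-norm weight from the paper.
    """
    phi, freqs = fourier_features(x_train, max_freq)
    weights = (1.0 + freqs.astype(float) ** 2) ** decay
    phi_w = phi / weights                      # phi @ W^{-1}
    gram = phi_w @ phi.T                       # phi @ W^{-1} @ phi.T
    # Closed form: c = W^{-1} phi.T (phi W^{-1} phi.T)^{-1} y
    coef = phi_w.T @ np.linalg.solve(gram, y_train)
    return coef, weights

# Six samples of a low-frequency target on [0, 1).
x_train = np.array([0.05, 0.2, 0.4, 0.55, 0.7, 0.9])
y_train = np.sin(2 * np.pi * x_train)

coef, weights = min_weighted_norm_fit(x_train, y_train)
phi, _ = fourier_features(x_train, 20)

# Baseline: the interpolant with the smallest unweighted (plain L2) coefficient norm.
coef_plain = np.linalg.pinv(phi) @ y_train

print("weighted norm, frequency-penalized fit:", np.sum(weights * coef**2))
print("weighted norm, plain min-norm fit:     ", np.sum(weights * coef_plain**2))
```

Both fits pass through the same six samples; they differ only in how the leftover freedom is spent. Raising `decay` makes the penalty on high frequencies steeper, which is the qualitative behavior the FP-norm is said to encode.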

Code & Models

Repositories

xuzhiqin1990/F-Principle
tf

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Face and Expression Recognition
