Is the Frequency Principle always valid?

Qijia Zhai

arXiv:2508.17323·cs.LG·August 26, 2025

TL;DR

This paper examines the frequency principle in shallow ReLU neural networks on the sphere, revealing conditions under which it holds or is violated, and how trainable directions influence learning dynamics and frequency emergence.

Contribution

It provides a harmonic analysis of the frequency principle on curved domains, highlighting the impact of initial conditions and trainable directions on learning behavior.

Findings

01

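For fixed neuron directions, spherical harmonic coefficients decay as \(O(\ell^{5/2}/2^\ell)\), an exponential decay in the degree \(\ell\) that typically produces the lower-frequency-first learning described by the frequency principle.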

02

The principle can be violated under specific initial conditions or error distributions.

03

Trainable directions can either preserve or accelerate high-frequency learning.

Abstract

We investigate the learning dynamics of shallow ReLU neural networks on the unit sphere \(S^2\subset\mathbb{R}^3\) in polar coordinates \((\tau,\phi)\), considering both fixed and trainable neuron directions \(\{w_i\}\). For fixed weights, spherical harmonic expansions reveal an intrinsic low-frequency preference, with coefficients decaying as \(O(\ell^{5/2}/2^\ell)\), which typically leads to the Frequency Principle (FP) of lower-frequency-first learning. However, this principle can be violated under specific initial conditions or error distributions. With trainable weights, an additional rotation term in the harmonic evolution equations preserves the exponential decay, now of order \(O(\ell^{7/2}/2^\ell)\), and again leads to the FP of lower-frequency-first learning. As in the fixed-weight case, the principle can still be violated under specific initial conditions or error distributions. Our…
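To get a feel for the two decay rates quoted in the abstract, the following minimal Python sketch evaluates the bound shapes \(\ell^{5/2}/2^\ell\) and \(\ell^{7/2}/2^\ell\). It is only an illustration under stated assumptions: the hidden constants in the \(O(\cdot)\) bounds are set to 1, and the helper names `envelope` and `peak_degree` are hypothetical, not taken from the paper. The values are upper-bound envelopes, not the actual harmonic coefficients of any network.

```python
def envelope(ell: int, power: float) -> float:
    """Shape of the bound ell**power / 2**ell, with the O(.) constant assumed to be 1."""
    return ell ** power / 2.0 ** ell


def peak_degree(power: float, max_ell: int = 60) -> int:
    """Integer degree in [1, max_ell] at which the envelope is largest."""
    return max(range(1, max_ell + 1), key=lambda ell: envelope(ell, power))


if __name__ == "__main__":
    # power = 5/2: fixed directions; power = 7/2: trainable directions (per the abstract).
    print(f"{'ell':>4} {'fixed (5/2)':>14} {'trainable (7/2)':>16}")
    for ell in (1, 2, 4, 8, 16, 30):
        print(f"{ell:>4} {envelope(ell, 2.5):>14.3e} {envelope(ell, 3.5):>16.3e}")
    print("envelope peaks at ell =", peak_degree(2.5), "and", peak_degree(3.5))
```

Both envelopes grow over the first few degrees before the \(2^{-\ell}\) factor takes over, so the bounds by themselves only force a low-frequency preference beyond small \(\ell\). The trainable-direction envelope is larger by a factor of \(\ell\) at every degree, yet it still decays exponentially.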
