Gradient Descent Fails to Learn High-frequency Functions and Modular Arithmetic
Rustem Takhanov, Maxat Tezekbayev, Artur Pak, Arman Bolatov, Zhenisbek Assylbekov

TL;DR
This paper demonstrates that gradient-based learning methods struggle to learn high-frequency periodic functions and modular arithmetic functions due to vanishing gradient variance, revealing fundamental limitations in neural network training for these classes.
Contribution
The paper provides a mathematical analysis showing the failure of gradient descent to learn high-frequency and modular functions, highlighting inherent limitations in current training methods.
Findings
Gradient variance is negligibly small for high-frequency functions.
Gradient descent fails to learn modular multiplication functions.
These limitations are most pronounced when the frequency or the prime base p is large.
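The vanishing-variance finding can be illustrated numerically. The sketch below is a toy setup and not the paper's experiment: the one-parameter model h(x) = tanh(w*x/p + b), the values of w and b, the squared loss, and the choice of targets f_a(x) = cos(2*pi*(a*x mod p)/p) are all assumptions made for illustration. For these targets, orthogonality of the cosines on Z_p gives (via Bessel's inequality) the bound Var_a[dL_a/dw] <= 4 * mean_x[(dh/dw)^2] / (p - 1), so the variance of the gradient over a random choice of a shrinks at least as fast as 1/p.

```python
import numpy as np

def gradient_variance(p, w=1.3, b=0.2):
    """Return (variance, bound) for a toy model on Z_p.

    variance: variance over a in {1, ..., p-1} of dL_a/dw, where
              L_a(w) = mean_x (h(x) - f_a(x))^2,
              h(x)   = tanh(w * x / p + b)              (hypothetical model),
              f_a(x) = cos(2*pi*(a*x mod p)/p)          (hypothetical targets).
    bound:    4 * mean_x[(dh/dw)^2] / (p - 1), from Bessel's inequality.
    """
    x = np.arange(p)
    t = x / p
    h = np.tanh(w * t + b)              # model output on every x in Z_p
    dh_dw = (1.0 - h**2) * t            # derivative of h with respect to w

    a = np.arange(1, p)
    residues = (a[:, None] * x[None, :]) % p          # a*x mod p, shape (p-1, p)
    targets = np.cos(2.0 * np.pi * residues / p)      # one row per target f_a

    # dL_a/dw = 2 * mean_x[(h - f_a) * dh/dw]; only the f_a term depends on a.
    grads = 2.0 * np.mean((h[None, :] - targets) * dh_dw[None, :], axis=1)

    variance = grads.var()                            # variance over the choice of a
    bound = 4.0 * np.mean(dh_dw**2) / (p - 1)
    return variance, bound

if __name__ == "__main__":
    for p in (11, 101, 1009):
        variance, bound = gradient_variance(p)
        print(f"p={p:5d}  gradient variance={variance:.3e}  Bessel bound={bound:.3e}")
```

Running the script for a few primes prints the variance next to its bound. Because the bound scales as 1/(p - 1), the gradient carries less and less information about which target a was chosen as p grows, which is the mechanism the findings describe.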
Abstract
Classes of target functions containing a large number of approximately orthogonal elements are known to be hard to learn by the Statistical Query algorithms. Recently this classical fact re-emerged in a theory of gradient-based optimization of neural networks. In the novel framework, the hardness of a class is usually quantified by the variance of the gradient with respect to a random choice of a target function. A set of functions of the form $x \mapsto ax \bmod p$, where $a$ is taken from $\mathbb{Z}_p$, has attracted some attention from deep learning theorists and cryptographers recently. This class can be understood as a subset of $p$-periodic functions on $\mathbb{Z}$ and is tightly connected with a class of high-frequency periodic functions on the real line. We present a mathematical analysis of limitations and challenges associated with using gradient-based learning techniques…
Taxonomy
Topics: Machine Learning and Algorithms · Statistical Methods and Inference · Mathematical Approximation and Integration
Methods: Sparse Evolutionary Training · Balanced Selection
