A new Uncertainty Principle in Machine Learning
V. Dolotin, A. Morozov

TL;DR
This paper introduces a novel uncertainty principle in machine learning, linking the sharpness of minima to the smoothness of the loss landscape, and explores its implications for polynomial approximations and neural network training.
Contribution
It formulates a new uncertainty principle specific to machine learning, extending classical Fourier analysis concepts to sigmoid functions and explaining training difficulties.
Findings
Identifies a fundamental trade-off between minimum sharpness and landscape smoothness.
Links the uncertainty principle to the degeneracy of sigmoid expansions.
Provides insights into the limitations of standard training methods.
Abstract
Many scientific problems in the context of machine learning can be reduced to the search for polynomial answers in appropriate variables. The Heavisidization of an arbitrary polynomial is actually provided by one and the same two-layer expression. What prevents the use of this simple idea is the fatal degeneracy of the Heaviside and sigmoid expansions, which traps the steepest-descent evolution at the bottom of canyons, close to the starting point but far from the desired true minimum. This problem is unavoidable and can be formulated as a peculiar uncertainty principle: the sharper the minimum, the smoother the canyons. It is a direct analogue of the usual one, which is the pertinent property of the more familiar Fourier expansion. Standard machine learning software fights this problem empirically, for example, by testing evolutions originating at randomly distributed starting…
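The tension the abstract describes can be seen in a single sigmoid unit. The sketch below is only an illustration under stated assumptions, not the authors' two-layer construction: the names `sigmoid`, `sigmoid_grad`, `heaviside` and the sharpness parameter `beta` are hypothetical and do not come from the paper. It uses the standard fact that the logistic function of `beta * x` approaches a Heaviside step as `beta` grows, while its derivative concentrates at the step and vanishes elsewhere.

```python
import math


def sigmoid(x, beta=1.0):
    """Logistic sigmoid of beta * x; tends to a Heaviside step as beta grows."""
    z = beta * x
    # Branch on the sign of z so that exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_grad(x, beta=1.0):
    """Derivative of sigmoid(x, beta) with respect to x."""
    s = sigmoid(x, beta)
    return beta * s * (1.0 - s)


def heaviside(x):
    """Step function: 1 for x > 0, else 0."""
    return 1.0 if x > 0 else 0.0


if __name__ == "__main__":
    x_away = 0.5  # a point away from the step at x = 0
    for beta in (1.0, 10.0, 100.0):
        gap = abs(sigmoid(x_away, beta) - heaviside(x_away))
        print(
            f"beta={beta:6.1f}  "
            f"|sigmoid - step| at x=0.5: {gap:.3e}  "
            f"slope at x=0: {sigmoid_grad(0.0, beta):.3e}  "
            f"slope at x=0.5: {sigmoid_grad(x_away, beta):.3e}"
        )
```

As `beta` increases, the slope at the step grows like `beta / 4` while the slope at `x = 0.5` shrinks toward zero, so a steepest-descent update that starts away from the step receives almost no signal. This is one simple way to picture why sharp features and flat surroundings come together; the paper's own formulation of the principle may differ.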
Taxonomy
Topics: Mathematical and Theoretical Analysis · Computability, Logic, AI Algorithms · Gaussian Processes and Bayesian Inference
