Noise-induced degeneration in online learning
Yuzuru Sato, Daiji Tsutsui, and Akio Fujiwara

TL;DR
This paper analyzes the stability of stochastic gradient descent near degenerated subspaces of a multi-layer perceptron. It shows that noise can produce a strong plateau through noise-induced synchronization, and that an optimal fluctuation level minimizes the escape time from the degenerated subspace.
Contribution
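It provides a theoretical analysis of noise-induced degeneration and plateau phenomena in stochastic gradient descent for the Fukumizu-Amari model, a minimal multi-layer perceptron with non-trivial plateaus. It shows that noise can create strong plateaus absent from deterministic gradient descent and that an optimal fluctuation minimizes the escape time from them.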
Findings
Existence of attracting regions in degenerated subspaces
Plateau phenomena caused by noise-induced synchronization
Optimal fluctuations minimize escape time from degenerated states
Abstract
In order to elucidate the plateau phenomena caused by vanishing gradients, we herein analyse the stability of stochastic gradient descent near degenerated subspaces in a multi-layer perceptron. In stochastic gradient descent for the Fukumizu-Amari model, which is the minimal multi-layer perceptron showing non-trivial plateau phenomena, we show that (1) attracting regions exist in multiply degenerated subspaces, (2) a strong plateau phenomenon emerges as a noise-induced synchronisation, which is not observed in deterministic gradient descent, and (3) an optimal fluctuation exists that minimises the escape time from the degenerated subspace. The noise-induced degeneration observed herein is expected to be found in a broad class of machine learning via neural networks.
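As a rough illustration of the setting the abstract describes, the sketch below runs online stochastic gradient descent on a small teacher-student perceptron and records how far apart the two hidden weights are. It is a minimal sketch under stated assumptions, not the authors' model or experiment: the scalar-input network with two tanh units, the teacher parameters, the initial student values, the learning rate, and the `label_noise` term are all hypothetical choices made for this example, and the exact Fukumizu-Amari parametrisation may differ. The gap |w1 - w2| is used here as a simple proxy for one kind of degeneration, where the two hidden units coincide.

```python
import math
import random


def forward(x, w, v):
    """Scalar-input perceptron with two tanh hidden units (assumed form)."""
    return v[0] * math.tanh(w[0] * x) + v[1] * math.tanh(w[1] * x)


def run_online_sgd(steps=5000, lr=0.05, label_noise=0.1, seed=0):
    """Run online SGD on squared error; return |w1 - w2| after each update."""
    rng = random.Random(seed)
    # Hypothetical teacher with two distinct hidden units.
    teacher_w, teacher_v = [0.5, 2.0], [1.0, -1.0]
    # Hypothetical student, started close to the subspace w1 == w2.
    w, v = [0.9, 1.1], [0.3, 0.2]
    gaps = []
    for _ in range(steps):
        # One fresh sample per step: the sampling itself is a noise source.
        x = rng.gauss(0.0, 1.0)
        y = forward(x, teacher_w, teacher_v) + label_noise * rng.gauss(0.0, 1.0)
        err = forward(x, w, v) - y
        h = [math.tanh(w[0] * x), math.tanh(w[1] * x)]
        # Gradients of 0.5 * err**2, computed before any parameter changes.
        grad_w = [err * v[i] * (1.0 - h[i] ** 2) * x for i in range(2)]
        grad_v = [err * h[i] for i in range(2)]
        for i in range(2):
            w[i] -= lr * grad_w[i]
            v[i] -= lr * grad_v[i]
        gaps.append(abs(w[0] - w[1]))
    return gaps


if __name__ == "__main__":
    # Compare how close the hidden weights get under different noise levels.
    for noise in (0.0, 0.1, 0.5):
        gaps = run_online_sgd(label_noise=noise)
        print(noise, min(gaps), gaps[-1])
```

Comparing the recorded gaps across noise levels and seeds gives a crude way to look for long stretches where the hidden units stay nearly identical. Whether this toy setup reproduces the plateau or the optimal-fluctuation effect reported in the paper is not claimed here.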
Taxonomy
Topics: stochastic dynamics and bifurcation · Neural Networks and Applications · Neural dynamics and brain function
