Nonlinear Dynamics In Optimization Landscape of Shallow Neural Networks with Tunable Leaky ReLU
Jingzhou Liu

TL;DR
This paper analyzes the nonlinear dynamics and bifurcations in the optimization landscape of shallow neural networks with leaky ReLU activation, revealing width-independent symmetry-breaking phenomena.
Contribution
It introduces a theoretical framework, based on the equivariant gradient degree, for detecting bifurcations in shallow networks with leaky ReLU activation; the framework applies to any number of neurons k≥4.
Findings
Bifurcations from the global minimum occur at the critical value 0 of the leaky parameter, independent of network width.
Bifurcations are width-independent and occur only for nonnegative leaky parameters.
The global minimum remains stable, with no further symmetry breaking, for leaky parameters in the range (0,1).
Abstract
In this work, we study the nonlinear dynamics of a shallow neural network trained with mean-squared loss and leaky ReLU activation. Under Gaussian inputs and equal layer width k, (1) we establish, based on the equivariant gradient degree, a theoretical framework, applicable to any number of neurons k ≥ 4, to detect bifurcation of critical points with associated symmetries from the global minimum as the leaky parameter varies. Typically, our analysis reveals that a multi-mode degeneracy consistently occurs at the critical number 0, independent of k. (2) As a by-product, we further show that such bifurcations are width-independent, arise only for a nonnegative leaky parameter, and that the global minimum undergoes no further symmetry-breaking instability throughout the engineering regime in the range (0,1). An explicit example with k=5 is presented to illustrate the framework and exhibit the…
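To make the setting in the abstract concrete, here is a minimal NumPy sketch of a shallow leaky-ReLU network trained with mean-squared loss on Gaussian inputs. Several details are assumptions, since the abstract does not state them: the teacher-student form of the loss, the identity matrix as the target weights V, the leaky-ReLU parametrization (slope alpha on the negative side), the sample count, and all function names. The sketch only checks that the loss vanishes at the target and is unchanged when the neurons are permuted, which is the kind of symmetry the abstract refers to; it does not implement the equivariant gradient degree analysis.

```python
import numpy as np

def leaky_relu(z, alpha):
    """Leaky ReLU with tunable slope alpha on the negative side (assumed parametrization)."""
    return np.where(z > 0, z, alpha * z)

def shallow_net(X, W, alpha):
    """Sum of k leaky-ReLU neurons. X has shape (n, d), W has shape (k, d)."""
    return leaky_relu(X @ W.T, alpha).sum(axis=1)

def mse_loss(W, V, alpha, n_samples=20000, seed=0):
    """Monte Carlo estimate of 0.5 * E[(f(x; W) - f(x; V))^2] with x ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, W.shape[1]))
    diff = shallow_net(X, W, alpha) - shallow_net(X, V, alpha)
    return 0.5 * np.mean(diff ** 2)

k = 5          # width used in the paper's explicit example
alpha = 0.1    # a leaky parameter inside the range (0, 1)
V = np.eye(k)  # hypothetical target weights (assumption, not from the abstract)

# Permuting the neurons (rows of W) leaves the network output unchanged.
W_permuted = V[np.random.default_rng(1).permutation(k)]
# A generic perturbation moves away from the global minimum.
W_perturbed = V + 0.1 * np.random.default_rng(2).standard_normal((k, k))

loss_at_target = mse_loss(V, V, alpha)
loss_permuted = mse_loss(W_permuted, V, alpha)
loss_perturbed = mse_loss(W_perturbed, V, alpha)
```

Under this parametrization, alpha = 0 gives the plain ReLU and alpha = 1 gives a linear activation, so the range (0,1) interpolates between the two.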
