On the Principle of Least Symmetry Breaking in Shallow ReLU Models
Yossi Arjevani, Michael Field

TL;DR
This paper investigates the optimization landscape of two-layer ReLU networks, revealing that SGD tends to find minima that minimally break the symmetry of the target, a principle that appears to extend beyond Gaussian inputs.
Contribution
The paper introduces the principle of least symmetry breaking as a key factor in the structure of local minima in shallow ReLU networks, supported by theoretical analysis and experiments.
Findings
Spurious local minima found by SGD exhibit minimal symmetry breaking relative to the target weights.
The analysis suggests the principle of least symmetry breaking may apply beyond standard Gaussian inputs.
Experiments corroborate this for non-isotropic non-product input distributions, smooth activation functions, and networks with a few layers.
Abstract
We consider the optimization problem associated with fitting two-layer ReLU networks with respect to the squared loss, where labels are assumed to be generated by a target network. Focusing first on standard Gaussian inputs, we show that the structure of spurious local minima detected by stochastic gradient descent (SGD) is, in a well-defined sense, the least loss of symmetry with respect to the target weights. A closer look at the analysis indicates that this principle of least symmetry breaking may apply to a broader range of settings. Motivated by this, we conduct a series of experiments which corroborate this hypothesis for different classes of non-isotropic non-product distributions, smooth activation functions and networks with a few layers.
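The setup described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. Several details are assumptions made only for illustration: second-layer weights fixed to 1, an identity target matrix `V`, equal width and input dimension (`k = d = 5`), and a Monte Carlo estimate in place of the exact expected loss. The names `empirical_loss`, `W`, `V` and `X` are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def empirical_loss(W, V, X):
    """Monte Carlo estimate of 0.5 * E[(f_W(x) - f_V(x))^2].

    Assumption: f_W(x) = sum_i relu(w_i . x), i.e. second-layer weights are 1.
    Rows of W and V are hidden-unit weight vectors; rows of X are input samples.
    """
    pred = relu(X @ W.T).sum(axis=1)    # student network outputs
    target = relu(X @ V.T).sum(axis=1)  # labels generated by the target network
    return 0.5 * np.mean((pred - target) ** 2)

rng = np.random.default_rng(0)
d = k = 5                               # assumed sizes, for illustration only
X = rng.standard_normal((10_000, d))    # standard Gaussian inputs
V = np.eye(d)                           # assumed target weights
W = rng.standard_normal((k, d))         # a random student initialization

loss_W = empirical_loss(W, V, X)
loss_target = empirical_loss(V, V, X)   # student equal to target: zero loss

# Reordering the hidden units (rows of W) leaves the network function unchanged,
# so the loss is the same up to floating-point rounding.
loss_perm = empirical_loss(W[rng.permutation(k)], V, X)
```

The permutation check shows the simplest symmetry of this loss: the network output is a sum over hidden units, so reordering them cannot change it. Symmetries of this kind are what a minimum can preserve or break relative to the target weights.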
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods
