Exponentially vanishing sub-optimal local minima in multilayer neural networks
Daniel Soudry, Elad Hoffer

TL;DR
This paper proves that, for a multilayer neural network with one hidden layer of piecewise linear units, the volume of differentiable regions containing sub-optimal local minima is exponentially small relative to that of global minima as the number of datapoints grows. Numerical experiments support the result.
Contribution
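It proves that sub-optimal differentiable local minima occupy an exponentially vanishing volume in a network with one hidden layer of piecewise linear units, a single output, and a quadratic loss. The aim is to avoid the heavily modified models or training methods, strong label assumptions, and unrealistic hidden layers that the abstract attributes to previous work.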
Findings
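The volume of differentiable regions containing sub-optimal local minima vanishes exponentially, relative to that of global minima, as the number of datapoints grows.
Global minima are present with high probability in large networks.
Numerical experiments on the CIFAR dataset with few hidden neurons support the result.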
Abstract
Background: Statistical mechanics results (Dauphin et al. (2014); Choromanska et al. (2015)) suggest that local minima with high error are exponentially rare in high dimensions. However, to prove low error guarantees for Multilayer Neural Networks (MNNs), previous works so far required either a heavily modified MNN model or training method, strong assumptions on the labels (e.g., "near" linear separability), or an unrealistic hidden layer with Ω(N) units.
Results: We examine an MNN with one hidden layer of piecewise linear units, a single output, and a quadratic loss. We prove that, with high probability in the limit of N → ∞ datapoints, the volume of differentiable regions of the empiric loss containing sub-optimal differentiable local minima is exponentially vanishing in comparison with the same volume of global minima, given standard normal input of…
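As a rough illustration of the setting the abstract describes (one hidden layer of piecewise linear units, a single output, a quadratic loss, standard normal inputs), here is a minimal NumPy sketch. The leaky-ReLU slope, the names W, z, X, y, the layer sizes, and the random binary targets are all assumptions made for illustration; the summary above does not specify the paper's exact parameterization.

```python
import numpy as np

def leaky_relu(u, slope=0.1):
    """Piecewise linear unit: u where u > 0, slope * u elsewhere (slope is an assumed value)."""
    return np.where(u > 0, u, slope * u)

def mse_loss(W, z, X, y):
    """Quadratic (mean squared error) loss of a one-hidden-layer, single-output network.

    W: (d1, d0) hidden-layer weights, z: (d1,) output weights,
    X: (d0, N) inputs, y: (N,) targets.
    """
    hidden = leaky_relu(W @ X)   # hidden activations, shape (d1, N)
    output = z @ hidden          # one scalar output per datapoint, shape (N,)
    return np.mean((output - y) ** 2)

# Illustrative sizes only, not taken from the paper.
rng = np.random.default_rng(0)
d0, d1, N = 8, 4, 100
X = rng.standard_normal((d0, N))      # standard normal inputs
y = rng.choice([0.0, 1.0], size=N)    # hypothetical binary targets
W = rng.standard_normal((d1, d0))
z = rng.standard_normal(d1)

loss = mse_loss(W, z, X, y)
```

Because the units are piecewise linear, the loss is differentiable wherever no pre-activation in `W @ X` is exactly zero. The sign pattern of those pre-activations identifies the differentiable region a given `W` falls in.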
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques
