On the loss landscape of a class of deep neural networks with no bad local valleys
Quynh Nguyen, Mahesh Chandra Mukkamala, Matthias Hein

TL;DR
This paper proves that a certain class of over-parameterized deep neural networks with standard activation functions and cross-entropy loss has no bad local valleys: from any starting point in parameter space there is a continuous path along which the loss is non-increasing and gets arbitrarily close to zero.
Contribution
It establishes a theoretical guarantee that the loss landscape of these networks has no bad local valleys, and hence no sub-optimal strict local minima.
Findings
The networks in this class have no sub-optimal strict local minima.
From any point in parameter space there exists a continuous path along which the cross-entropy loss is non-increasing.
The loss gets arbitrarily close to zero along these paths.
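The behaviour described in these findings can be illustrated on a toy problem. The sketch below is not the paper's construction and does not involve a deep network: it assumes a linear softmax classifier on four hand-picked, linearly separable points (the data, the weight matrix `W`, and the helper `cross_entropy` are all hypothetical). Scaling weights that already classify every point correctly traces a path on which the cross-entropy loss keeps decreasing and approaches zero without ever reaching it.

```python
import numpy as np

def cross_entropy(W, X, y):
    """Mean softmax cross-entropy of the linear scores X @ W against integer labels y."""
    scores = X @ W
    scores = scores - scores.max(axis=1, keepdims=True)  # shift for numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

# Toy data (an assumption for illustration): two linearly separable classes.
X = np.array([[2.0, 0.0], [1.5, 0.5], [0.0, 2.0], [0.5, 1.5]])
y = np.array([0, 0, 1, 1])

# Weights that classify every point correctly with a positive margin.
W = np.eye(2)

# Move along the ray s * W: the loss shrinks at every step but stays positive.
scales = [1.0, 2.0, 4.0, 8.0, 16.0]
losses = [cross_entropy(s * W, X, y) for s in scales]

for s, loss in zip(scales, losses):
    print(f"scale={s:5.1f}  loss={loss:.3e}")
```

The example only shows what a non-increasing path toward zero loss looks like in the simplest case. The paper's result concerns the existence of such a path from every starting point for its class of deep networks.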
Abstract
We identify a class of over-parameterized deep neural networks with standard activation functions and cross-entropy loss which provably have no bad local valley, in the sense that from any point in parameter space there exists a continuous path on which the cross-entropy loss is non-increasing and gets arbitrarily close to zero. This implies that these networks have no sub-optimal strict local minima.
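One way to write the property stated in the abstract is given below. The notation is an assumption made for this sketch and is not taken from the paper: $\Theta$ denotes the parameter space, $L$ the cross-entropy loss, and $\gamma$ a path in parameter space.

```latex
% "No bad local valley", as described in the abstract (notation assumed):
\forall\, \theta_0 \in \Theta \;\; \exists\, \gamma : [0,1) \to \Theta \ \text{continuous, with } \gamma(0) = \theta_0,
\qquad
L(\gamma(t)) \le L(\gamma(s)) \ \ \text{for all } 0 \le s \le t < 1,
\qquad
\inf_{t \in [0,1)} L(\gamma(t)) = 0 .
```

A brief sketch of why this excludes sub-optimal strict local minima: if $\theta^\ast$ were a strict local minimum, every nearby point other than $\theta^\ast$ would have strictly larger loss, so a continuous path starting at $\theta^\ast$ with non-increasing loss could never leave $\theta^\ast$. The loss along that path would then stay at $L(\theta^\ast) > 0$, contradicting the requirement that it gets arbitrarily close to zero.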
Taxonomy
Topics: Machine Learning and ELM · Neural Networks and Applications · Stochastic Gradient Optimization Techniques
