Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek

TL;DR
This paper provides a complete classification of critical points in the loss landscape of shallow neural networks with one hidden layer and affine targets, revealing the absence of local maxima and the conditions for local minima.
Contribution
It offers a comprehensive analysis of critical points for shallow networks with ReLU, leaky ReLU, or quadratic activations, specifically for affine target functions.
Findings
The loss landscape has no local maxima.
Non-global local minima can only be caused by dead ReLU neurons.
Non-global local minima do not occur with leaky ReLU or quadratic activation.
Abstract
In this paper, we analyze the landscape of the true loss of neural networks with one hidden layer and ReLU, leaky ReLU, or quadratic activation. In all three cases, we provide a complete classification of the critical points in the case where the target function is affine and one-dimensional. In particular, we show that there exist no local maxima and clarify the structure of saddle points. Moreover, we prove that non-global local minima can only be caused by 'dead' ReLU neurons. In particular, they do not appear in the case of leaky ReLU or quadratic activation. Our approach is of a combinatorial nature and builds on a careful analysis of the different types of hidden neurons that can occur.
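To make the role of a dead ReLU neuron concrete, here is a minimal numerical sketch. It is not the paper's construction: the input interval [0, 1], the target f(x) = x, the single hidden neuron, the grid average used as a stand-in for the true loss, and the helper name loss_and_grad are all assumptions chosen for illustration. With the neuron inactive on the whole interval, the network reduces to the constant c, every partial derivative vanishes at c = 0.5, and the loss stays strictly positive.

```python
import numpy as np

def loss_and_grad(w, b, v, c, alpha=1.0, beta=0.0, n=10001):
    """Grid approximation of the squared loss of v*relu(w*x + b) + c
    against the affine target alpha*x + beta on [0, 1] (assumed setup)."""
    x = np.linspace(0.0, 1.0, n)
    pre = w * x + b                       # pre-activation of the hidden neuron
    act = np.maximum(pre, 0.0)            # ReLU output
    active = (pre > 0).astype(float)      # ReLU derivative (0 where inactive)
    resid = v * act + c - (alpha * x + beta)
    loss = np.mean(resid ** 2)
    grad = np.array([
        np.mean(2 * resid * v * active * x),  # d loss / d w
        np.mean(2 * resid * v * active),      # d loss / d b
        np.mean(2 * resid * act),             # d loss / d v
        np.mean(2 * resid),                   # d loss / d c
    ])
    return loss, grad

# Dead neuron: w*x + b < 0 for every x in [0, 1], so the network outputs c.
dead_loss, dead_grad = loss_and_grad(w=-1.0, b=-1.0, v=1.0, c=0.5)

# Active neuron reproducing the target exactly: relu(x) = x on [0, 1].
fit_loss, fit_grad = loss_and_grad(w=1.0, b=0.0, v=1.0, c=0.0)

print("dead neuron:  loss =", dead_loss, " max |grad| =", np.abs(dead_grad).max())
print("exact fit:    loss =", fit_loss, " max |grad| =", np.abs(fit_grad).max())
```

In this toy setting, small changes to w and b leave the neuron inactive, so the loss can only be changed through c, which already sits at its best constant value. That is why the dead configuration behaves like a non-global local minimum here.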
