Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding

Frank Zhengqing Wu; Berfin Simsek; Francois Gaston Ged

arXiv:2402.05626·cs.LG·March 18, 2025·1 cites

Open Access · 1 Video

TL;DR

This paper analyzes the loss landscape of shallow (one-hidden-layer) ReLU-like neural networks trained with the empirical squared loss. It identifies the stationary points, determines which of them are local minima, and examines how network embedding reshapes them.

Contribution

It studies directional stationary points, rather than other notions such as Clarke stationary points, to account for the non-differentiability of the loss, and it characterizes their properties in shallow ReLU-like networks.
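
For orientation, the standard textbook definition of directional stationarity can be written as follows. This is a sketch of the usual notion, not a quotation from the paper: the symbols L (loss), θ (parameters), v (direction), ρ (activation), and the slopes α⁺, α⁻ are assumed notation, and the piecewise-linear form of ρ is one common reading of "ReLU-like".

```latex
% One-sided directional derivative of the loss L at parameters \theta along v
L'(\theta; v) \;=\; \lim_{t \to 0^{+}} \frac{L(\theta + t v) - L(\theta)}{t}

% \theta is a directional stationary point if no direction decreases L to first order
L'(\theta; v) \;\ge\; 0 \quad \text{for all directions } v

% Assumed form of a ReLU-like activation (ReLU: \alpha^{+} = 1,\ \alpha^{-} = 0)
\rho(z) \;=\;
\begin{cases}
  \alpha^{+} z, & z \ge 0, \\
  \alpha^{-} z, & z < 0.
\end{cases}
```

Because ρ has a kink at zero, L'(θ; v) and −L'(θ; −v) can differ, which is why a one-sided notion is needed in place of a vanishing gradient.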

Findings

01

Stationary points without escape neurons are local minima.

02

For scalar-output networks, the presence of an escape neuron guarantees that a stationary point is not a local minimum.

03

Network embedding reshapes stationary points and influences training dynamics.

Abstract

In this paper, we study the loss landscape of one-hidden-layer neural networks with ReLU-like activation functions trained with the empirical squared loss using gradient descent (GD). We identify the stationary points of such networks, which significantly slow down loss decrease during training. To capture such points while accounting for the non-differentiability of the loss, the stationary points that we study are directional stationary points, rather than other notions like Clarke stationary points. We show that, if a stationary point does not contain "escape neurons", which are defined with first-order conditions, it must be a local minimum. Moreover, for the scalar-output case, the presence of an escape neuron guarantees that the stationary point is not a local minimum. Our results refine the description of the saddle-to-saddle training process starting from infinitesimally small…
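
The setting in the abstract can be probed numerically. The sketch below is illustrative only and is not the authors' code: the function names, the piecewise-linear form of the activation with slopes `alpha_pos` and `alpha_neg`, the finite-difference step `t`, and the toy data are all assumptions. It estimates one-sided directional derivatives of the empirical squared loss for a one-hidden-layer network with scalar output, first at a kink of the activation and then at the all-zero parameter point.

```python
import numpy as np

def relu_like(z, alpha_pos=1.0, alpha_neg=0.0):
    # Assumed ReLU-like form: slope alpha_pos for z >= 0, slope alpha_neg for z < 0.
    return np.where(z >= 0, alpha_pos * z, alpha_neg * z)

def loss(W, a, X, y):
    # W: (hidden, d) input weights, a: (hidden,) output weights,
    # X: (n, d) inputs, y: (n,) scalar targets.
    pred = relu_like(X @ W.T) @ a
    return 0.5 * np.sum((pred - y) ** 2)

def directional_derivative(W, a, dW, da, X, y, t=1e-6):
    # Forward finite-difference estimate of the one-sided derivative
    # lim_{t -> 0+} (L(theta + t v) - L(theta)) / t along v = (dW, da).
    return (loss(W + t * dW, a + t * da, X, y) - loss(W, a, X, y)) / t

# Toy case: one neuron sitting exactly on the kink of the activation.
X1, y1 = np.array([[1.0]]), np.array([1.0])
W1, a1 = np.array([[0.0]]), np.array([1.0])
d_plus = directional_derivative(W1, a1, np.array([[1.0]]), np.array([0.0]), X1, y1)
d_minus = directional_derivative(W1, a1, np.array([[-1.0]]), np.array([0.0]), X1, y1)
# d_plus is close to -1 while d_minus is 0: the two one-sided derivatives are
# not negatives of each other, so the loss has no gradient at this point.

# All-zero parameters: the loss changes only at second order in t,
# so every directional derivative is (numerically) zero.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(5, 3)), rng.normal(size=5)
W0, a0 = np.zeros((4, 3)), np.zeros(4)
d_origin = directional_derivative(W0, a0, rng.normal(size=(4, 3)), rng.normal(size=4), X, y)
```

In this toy example the first point is not directionally stationary, since one direction has a negative one-sided derivative, whereas the origin passes the first-order check in the sampled direction. A finite-difference probe of this kind only samples directions; it cannot certify stationarity or decide whether a point is a local minimum.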

Videos

Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding · SlidesLive

Taxonomy

Topics: Neural Networks and Applications