Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape

Ioannis Bantzis; James B. Simon; Arthur Jacot

arXiv:2505.21722 · cs.LG · April 21, 2026

TL;DR

This paper studies how gradient descent escapes the initial saddle at the origin in deep ReLU networks with small initial weights. It shows that the optimal escape direction has a low-rank bias in its deeper layers and suggests that training follows saddle-to-saddle dynamics.

Contribution

It characterizes the escape directions along which gradient descent leaves the saddle at the origin, proves a low-rank bias in the deeper layers of the optimal escape direction, and proposes that deep ReLU networks exhibit saddle-to-saddle dynamics.

Findings

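01. The optimal escape direction has a low-rank bias in its deeper layers.

02. In the optimal escape direction, the first singular value of the $\ell$-th layer weight matrix is at least $\ell^{1/4}$ times larger than any other singular value.

03. The authors suggest that deep ReLU networks exhibit saddle-to-saddle dynamics, with gradient descent visiting a sequence of saddles with increasing bottleneck rank.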

Abstract

When a deep ReLU network is initialized with small weights, gradient descent (GD) is at first dominated by the saddle at the origin in parameter space. We study the so-called escape directions along which GD leaves the origin, which play a role similar to that of the eigenvectors of the Hessian for strict saddles. We show that the optimal escape direction features a low-rank bias in its deeper layers: the first singular value of the $\ell$-th layer weight matrix is at least $\ell^{1/4}$ times larger than any other singular value. We also prove a number of related results about these escape directions. We suggest that deep ReLU networks exhibit saddle-to-saddle dynamics, with GD visiting a sequence of saddles with increasing bottleneck rank (Jacot, 2023).
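
To get a feel for the singular-value bound in the abstract, here is a minimal sketch that evaluates the threshold $\ell^{1/4}$ and checks it against one layer's singular values. It assumes the bound is a multiplicative gap (the largest singular value divided by any other is at least $\ell^{1/4}$) and that layers are indexed from 1. The function names `min_gap` and `satisfies_gap` are hypothetical, and the input numbers are invented for illustration; none of this comes from the paper.

```python
def min_gap(layer: int) -> float:
    """Assumed lower bound layer**(1/4) on sigma_1 / sigma_i (i >= 2)."""
    if layer < 1:
        raise ValueError("layers are indexed from 1 in this sketch")
    return layer ** 0.25


def satisfies_gap(singular_values, layer: int) -> bool:
    """Check sigma_1 >= layer**(1/4) * sigma_2 for one layer.

    Comparing against the second-largest value is enough, because every
    other singular value is no larger than it.
    """
    s = sorted(singular_values, reverse=True)
    if len(s) < 2:
        return True  # a single singular value has nothing to compare against
    return s[0] >= min_gap(layer) * s[1]


# How slowly the guaranteed gap grows with depth.
for ell in (1, 16, 81, 256):
    print(ell, round(min_gap(ell), 6))

# Invented singular values for a hypothetical layer 16 (not paper results).
print(satisfies_gap([4.0, 1.9, 0.5], layer=16))
print(satisfies_gap([4.0, 2.1, 0.5], layer=16))
```

Under this reading the guaranteed gap grows slowly with depth: it reaches a factor of 2 only at layer 16 and a factor of 3 at layer 81, so the stated bound is most informative for the deepest layers.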

Videos

Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape · SlidesLive