Rod Flow: A Continuous-Time Model for Gradient Descent at the Edge of Stability

Eric Regis; Sinho Chewi

arXiv:2602.01480 · cs.LG · February 3, 2026

TL;DR

This paper introduces Rod Flow, a new continuous-time model for gradient descent that better captures the dynamics at the edge of stability, providing theoretical insights and practical advantages over existing models.

Contribution

Rod Flow offers a principled, physically motivated ODE approximation to GD dynamics. Compared with Central Flow, it better captures GD on simple toy examples, matches its accuracy on representative neural network architectures, and is explicit and cheap to compute.

Findings

01. Rod Flow accurately predicts the critical sharpness threshold.

02. It explains self-stabilization in quartic potentials.

03. It matches the accuracy of Central Flow for representative neural network architectures.

Abstract

How can we understand gradient-based training over non-convex landscapes? The edge of stability phenomenon, introduced in Cohen et al. (2021), indicates that the answer is not so simple: namely, gradient descent (GD) with large step sizes often diverges away from the gradient flow. In this regime, the "Central Flow", recently proposed in Cohen et al. (2025), provides an accurate ODE approximation to the GD dynamics over many architectures. In this work, we propose Rod Flow, an alternative ODE approximation, which carries the following advantages: (1) it rests on a principled derivation stemming from a physical picture of GD iterates as an extended one-dimensional object -- a "rod"; (2) it better captures GD dynamics for simple toy examples and matches the accuracy of Central Flow for representative neural network architectures; and (3) it is explicit and cheap to compute. Theoretically, we…
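
As a minimal illustration of why GD with large step sizes can depart from gradient flow, the sketch below compares the two on a one-dimensional quadratic f(x) = 0.5 * a * x^2. This toy setup is an assumption made here for illustration and is not taken from the paper; the names `run_gd` and `gradient_flow`, and all constants, are hypothetical. On this quadratic, GD multiplies x by (1 - step_size * a) at every step, so it contracts only when the sharpness a is below 2 / step_size, while gradient flow decays for any a > 0.

```python
import math


def run_gd(sharpness, step_size, x0=1.0, num_steps=50):
    """Run GD on f(x) = 0.5 * sharpness * x**2 and return the final |x|."""
    x = x0
    for _ in range(num_steps):
        grad = sharpness * x          # f'(x) = sharpness * x
        x = x - step_size * grad      # GD update: x <- (1 - step_size * sharpness) * x
    return abs(x)


def gradient_flow(sharpness, time, x0=1.0):
    """Exact gradient flow solution x(t) = x0 * exp(-sharpness * t); return |x(t)|."""
    return abs(x0 * math.exp(-sharpness * time))


step_size = 0.1
threshold = 2.0 / step_size           # stability threshold for this quadratic
num_steps = 50

# Sharpness slightly below and slightly above the threshold (illustrative values).
below = run_gd(0.9 * threshold, step_size, num_steps=num_steps)
above = run_gd(1.1 * threshold, step_size, num_steps=num_steps)

# Gradient flow over the same elapsed time decays in both cases.
flow_above = gradient_flow(1.1 * threshold, time=step_size * num_steps)

print(f"threshold 2/step_size  = {threshold}")
print(f"GD, sharpness below    : |x| = {below:.3e}")
print(f"GD, sharpness above    : |x| = {above:.3e}")
print(f"flow, sharpness above  : |x| = {flow_above:.3e}")
```

On a pure quadratic, GD above the threshold simply blows up; the edge of stability regime described in the abstract concerns non-quadratic losses, where training continues despite the sharpness reaching this threshold, and this sketch does not attempt to model that behavior.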

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Neural Networks and Reservoir Computing