Rod Flow: A Continuous-Time Model for Gradient Descent at the Edge of Stability
Eric Regis, Sinho Chewi

TL;DR
This paper introduces Rod Flow, a continuous-time (ODE) model of gradient descent at the edge of stability that is explicit, cheap to compute, and at least as accurate as the existing Central Flow model.
Contribution
Rod Flow offers a principled, physically motivated ODE approximation to GD dynamics. It captures GD more accurately than Central Flow on simple toy examples, matches its accuracy on representative neural network architectures, and is explicit and cheap to compute.
Findings
Rod Flow accurately predicts the critical sharpness threshold.
It explains self-stabilization in quartic potentials.
It matches the accuracy of Central Flow for representative neural network architectures.
Abstract
How can we understand gradient-based training over non-convex landscapes? The edge of stability phenomenon, introduced in Cohen et al. (2021), indicates that the answer is not so simple: namely, gradient descent (GD) with large step sizes often diverges away from the gradient flow. In this regime, the "Central Flow", recently proposed in Cohen et al. (2025), provides an accurate ODE approximation to the GD dynamics over many architectures. In this work, we propose Rod Flow, an alternative ODE approximation, which carries the following advantages: (1) it rests on a principled derivation stemming from a physical picture of GD iterates as an extended one-dimensional object, a "rod"; (2) it better captures GD dynamics for simple toy examples and matches the accuracy of Central Flow for representative neural network architectures; and (3) it is explicit and cheap to compute. Theoretically, we…
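As a rough illustration of why large step sizes pull GD away from the gradient flow, the sketch below runs plain GD on a one-dimensional quadratic. It assumes the classical stability condition for a quadratic, namely that GD with step size eta converges only when the curvature (sharpness) is below 2/eta. The function name `gd_on_quadratic` and all constants are hypothetical choices for this example. It does not implement Rod Flow or Central Flow.

```python
def gd_on_quadratic(sharpness, step_size, x0=1.0, num_steps=50):
    """Run GD on f(x) = 0.5 * sharpness * x**2 and return all iterates.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    xs = [x0]
    for _ in range(num_steps):
        grad = sharpness * xs[-1]             # f'(x) = sharpness * x
        xs.append(xs[-1] - step_size * grad)  # x <- (1 - step_size * sharpness) * x
    return xs


step_size = 0.1
threshold = 2.0 / step_size  # classical stability threshold for a quadratic

# Sharpness slightly below the threshold: iterates oscillate in sign but shrink.
below = gd_on_quadratic(0.9 * threshold, step_size)
# Sharpness slightly above the threshold: iterates oscillate in sign and grow.
above = gd_on_quadratic(1.1 * threshold, step_size)

print("final |x| below threshold:", abs(below[-1]))
print("final |x| above threshold:", abs(above[-1]))
```

On a pure quadratic the iterates above the threshold simply blow up. The self-stabilization that the paper studies needs a non-quadratic potential (the findings mention quartic potentials), which this sketch deliberately leaves out.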
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Neural Networks and Reservoir Computing
