A Rod Flow Model for Adam at the Edge of Stability
Eric Regis, Sinho Chewi

TL;DR
This paper extends the rod flow model from gradient descent to Adam and to other momentum and adaptive optimizers, tracking their behavior at the edge of stability more closely than stable flow models.
Contribution
It extends rod flow to Adam by working in the joint phase space of parameters and first moment, with the second moment treated as a smooth auxiliary variable, and it empirically validates improved tracking of the discrete iterates at the edge of stability.
Findings
Rod flow models accurately track Adam's iterates at the edge of stability.
The extension covers eight optimizers: heavy ball momentum, Nesterov momentum, and scalar and per-component versions of RMSProp, Adam, and NAdam.
Empirically, rod flow tracks the discrete iterates more closely than stable flow models.
Abstract
Cohen et al. (arXiv:2207.14484) observed that adaptive gradient methods such as Adam operate at the edge of stability. While there has been significant work on continuous-time modeling of gradient descent at the edge of stability, extending these models to momentum methods remains underdeveloped. In the gradient descent setting, Regis et al. (arXiv:2602.01480) introduced rod flow, which models consecutive iterates as an extended one-dimensional object -- a "rod." Here we extend rod flow to Adam by working in the joint phase space of parameters and first moment and treating the second moment as a smooth auxiliary variable. We also develop rod flows for heavy ball momentum, Nesterov momentum, and scalar and per-component versions of RMSProp, Adam, and NAdam. For all eight optimizers, we empirically evaluate rod flow on representative machine learning architectures, where it…
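To make the state variables named in the abstract concrete, here is a minimal sketch of the standard discrete Adam update. It records the joint (parameters, first moment) state and the second moment separately. It is not the paper's rod flow model, only the discrete iterates that such a model aims to track. The function name `adam_trajectory`, the hyperparameter values, and the toy quadratic loss are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def adam_trajectory(grad, theta0, lr=1e-2, beta1=0.9, beta2=0.999,
                    eps=1e-8, steps=100):
    """Run standard bias-corrected Adam and record its state at every step.

    Returns the joint (theta, m) states and the second moments v as arrays.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    m = np.zeros_like(theta)  # first moment (momentum buffer)
    v = np.zeros_like(theta)  # second moment (per-component)
    phase, second = [], []
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)  # bias correction
        v_hat = v / (1 - beta2**t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        phase.append(np.concatenate([theta, m]))  # joint phase-space point
        second.append(v.copy())                   # auxiliary variable
    return np.array(phase), np.array(second)

# Hypothetical toy problem: quadratic loss 0.5 * theta^T H theta.
H = np.diag([1.0, 10.0])
phase, second = adam_trajectory(lambda th: H @ th, theta0=[1.0, 1.0])
print(phase.shape, second.shape)
```

Each row of `phase` stacks the parameters and the first moment, which is the pair the abstract describes as the joint phase space. `second` holds the second moment, which the abstract treats as a smooth auxiliary variable.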
