Gradient Flow Matching for Learning Update Dynamics in Neural Network Training

Xiao Shou; Yanna Ding; Jianxi Gao

arXiv:2505.20221·cs.LG·May 27, 2025



TL;DR

Gradient Flow Matching (GFM) models neural network training as a dynamical system, accurately forecasting weight trajectories and convergence across architectures by learning optimizer-aware vector fields.

Contribution

GFM introduces a continuous-time framework that captures optimizer update rules, enabling accurate extrapolation of training dynamics and convergence prediction.

Findings

1. GFM achieves forecasting accuracy competitive with Transformer-based models.

2. GFM outperforms LSTM and classical baselines in training trajectory prediction.

3. GFM generalizes across different neural architectures and initializations.

Abstract

Training deep neural networks remains computationally intensive due to the iterative nature of gradient-based optimization. We propose Gradient Flow Matching (GFM), a continuous-time modeling framework that treats neural network training as a dynamical system governed by learned optimizer-aware vector fields. By leveraging conditional flow matching, GFM captures the underlying update rules of optimizers such as SGD, Adam, and RMSprop, enabling smooth extrapolation of weight trajectories toward convergence. Unlike black-box sequence models, GFM incorporates structural knowledge of gradient-based updates into the learning objective, facilitating accurate forecasting of final weights from partial training sequences. Empirically, GFM achieves forecasting accuracy that is competitive with Transformer-based models and significantly outperforms LSTM and other classical baselines.…
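To make the idea of forecasting final weights from a partial trajectory concrete, here is a minimal toy sketch in the spirit of flow matching. It is not the authors' implementation. Everything in it is an assumption made for illustration: the quadratic loss, the names `run_gd`, `fit_vector_field`, and `forecast`, the choice of consecutive weight snapshots as flow-matching pairs, and the time-independent linear vector field, which stands in for the optimizer-aware field described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: loss(w) = 0.5 * (w - w_star)^T A (w - w_star).
A = np.diag([1.0, 0.3])
w_star = np.array([0.5, -0.25])
lr = 0.1

def run_gd(w0, steps):
    """Return the (steps + 1, 2) weight trajectory of plain gradient descent."""
    traj = [w0]
    for _ in range(steps):
        grad = A @ (traj[-1] - w_star)
        traj.append(traj[-1] - lr * grad)
    return np.array(traj)

def fit_vector_field(prefixes, samples_per_pair=4):
    """Least-squares fit of a linear field v(w) = [w, 1] @ theta.

    For each consecutive pair (w_a, w_b) we sample t ~ U[0, 1], form the
    interpolant w_t = (1 - t) * w_a + t * w_b, and regress v(w_t) onto the
    displacement w_b - w_a (the flow-matching style target).
    """
    feats, targets = [], []
    for traj in prefixes:
        w_a, w_b = traj[:-1], traj[1:]
        for _ in range(samples_per_pair):
            t = rng.uniform(size=(len(w_a), 1))
            w_t = (1.0 - t) * w_a + t * w_b
            feats.append(np.hstack([w_t, np.ones((len(w_t), 1))]))
            targets.append(w_b - w_a)
    theta, *_ = np.linalg.lstsq(np.vstack(feats), np.vstack(targets), rcond=None)
    return theta  # shape (3, 2): two weight rows plus one bias row

def forecast(w, theta, steps):
    """Extrapolate with unit-step Euler integration of the fitted field."""
    for _ in range(steps):
        w = w + np.append(w, 1.0) @ theta
    return w

observed, total = 30, 200

# Fit on the first `observed` steps of several runs from random initializations.
train_runs = [run_gd(w_star + rng.normal(scale=2.0, size=2), observed) for _ in range(8)]
theta = fit_vector_field(train_runs)

# Forecast the final weights of a held-out run from its partial trajectory.
held_out = run_gd(w_star + np.array([2.0, -1.5]), total)
w_pred = forecast(held_out[observed], theta, total - observed)

forecast_err = np.linalg.norm(w_pred - held_out[-1])
naive_err = np.linalg.norm(held_out[observed] - held_out[-1])  # "stop early" baseline
print(f"forecast error: {forecast_err:.4f}, naive error: {naive_err:.4f}")
```

The naive baseline simply reuses the last observed weights, so comparing the two errors shows what the learned field contributes. A realistic setting would need a far more expressive field and, per the abstract, one that encodes the structure of optimizers such as Adam or RMSprop, which this sketch does not attempt.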


Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Reinforcement Learning in Robotics

Methods: Tanh Activation · Sigmoid Activation · Stochastic Gradient Descent · Long Short-Term Memory · Adam