Identifying Equivalent Training Dynamics

William T. Redman; Juan M. Bello-Rivas; Maria Fonoberova; Ryan Mohr; Ioannis G. Kevrekidis; Igor Mezić

arXiv:2302.09160 · cs.LG · November 1, 2024 · 6 cites

Open Access · 1 Repo

TL;DR

This paper introduces a novel framework using Koopman operator theory to identify when deep neural network training dynamics are equivalent, enabling better understanding of different training regimes and architectures.

Contribution

We develop a method that compares Koopman eigenvalues to determine whether two neural network training processes are dynamically equivalent, addressing the lack of methods for identifying when DNN models have equivalent training dynamics.
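The idea of comparing Koopman eigenvalues can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy two-dimensional linear map standing in for training dynamics, estimates eigenvalues with a plain least-squares dynamic mode decomposition (DMD), and compares spectra with a simple maximum difference. The names `simulate`, `dmd_eigenvalues`, `spectral_gap`, and the matrices `F`, `H`, `G`, `K` are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate(step, x0, n_steps):
    """Iterate the map `step` from x0; return the states as columns."""
    states = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps - 1):
        states.append(step(states[-1]))
    return np.stack(states, axis=1)

def dmd_eigenvalues(snapshots):
    """Estimate Koopman eigenvalues from one trajectory (least-squares DMD)."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]  # states and their successors
    A = Y @ np.linalg.pinv(X)                   # best linear map with A @ X ~ Y
    return np.sort_complex(np.linalg.eigvals(A))

def spectral_gap(eigs_a, eigs_b):
    """Largest difference between two sorted spectra (an assumed, simple metric)."""
    return float(np.max(np.abs(eigs_a - eigs_b)))

# Toy stand-in for training dynamics: a contracting linear map (assumption).
F = np.array([[0.9, 0.2],
              [0.0, 0.5]])
# Invertible change of coordinates, so G is conjugate to F by construction.
H = np.array([[2.0, 1.0],
              [1.0, 1.0]])
G = H @ F @ np.linalg.inv(H)
# A map with a different spectrum, hence not conjugate to F.
K = np.array([[0.7, 0.0],
              [0.0, 0.3]])

x0 = np.array([1.0, 1.0])
eig_f = dmd_eigenvalues(simulate(lambda x: F @ x, x0, 20))
eig_g = dmd_eigenvalues(simulate(lambda x: G @ x, H @ x0, 20))
eig_k = dmd_eigenvalues(simulate(lambda x: K @ x, x0, 20))

print("F vs G (conjugate):    ", spectral_gap(eig_f, eig_g))
print("F vs K (non-conjugate):", spectral_gap(eig_f, eig_k))
```

In this toy setting the conjugate pair yields matching spectra up to numerical error, while the non-conjugate pair does not. Real training dynamics are nonlinear and high-dimensional, so the choice of observables and of the distance between spectra matters far more than this sketch suggests.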

Findings

01. Successfully identified equivalence between online mirror descent and online gradient descent.

02. Detected non-conjugate dynamics between shallow and wide neural networks.

03. Characterized the early phase of training in CNNs and grokking in Transformers.

Abstract

Study of the nonlinear evolution deep neural network (DNN) parameters undergo during training has uncovered regimes of distinct dynamical behavior. While a detailed understanding of these phenomena has the potential to advance improvements in training efficiency and robustness, the lack of methods for identifying when DNN models have equivalent dynamics limits the insight that can be gained from prior work. Topological conjugacy, a notion from dynamical systems theory, provides a precise definition of dynamical equivalence, offering a possible route to address this need. However, topological conjugacies have historically been challenging to compute. By leveraging advances in Koopman operator theory, we develop a framework for identifying conjugate and non-conjugate training dynamics. To validate our approach, we demonstrate that comparing Koopman eigenvalues can correctly identify a…
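The link between topological conjugacy and Koopman eigenvalues mentioned in the abstract can be written out briefly. The block below is a standard textbook statement rather than a quotation from the paper; the symbols $f$, $g$, $h$, $M$, $N$, $U_f$, $U_g$, $\phi$, and $\lambda$ are notation assumed here for illustration.

```latex
% Two discrete-time maps f : M -> M and g : N -> N are topologically
% conjugate if there exists a homeomorphism h : M -> N such that
\[
  h \circ f = g \circ h .
\]
% The Koopman operator of f acts on observables \psi by composition:
\[
  (U_f \psi)(x) = \psi\bigl(f(x)\bigr).
\]
% If \phi is a Koopman eigenfunction of g with eigenvalue \lambda,
% i.e. U_g \phi = \lambda \phi, then \phi \circ h is an eigenfunction
% of f with the same eigenvalue:
\[
  U_f(\phi \circ h)
  = \phi \circ h \circ f
  = \phi \circ g \circ h
  = (U_g \phi) \circ h
  = \lambda \, (\phi \circ h).
\]
```

Conjugate systems therefore share Koopman eigenvalues, which is why a mismatch between estimated spectra can serve as evidence of non-conjugate dynamics.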

Code & Models

Repositories

william-redman/identifying_equivalent_training_dynamics
JAX · Official

Taxonomy

Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Gaussian Processes and Bayesian Inference