Leveraging the Graph Structure of Neural Network Training Dynamics
Fatemeh Vahedian, Ruiyu Li, Puja Trivedi, Di Jin, Danai Koutra

TL;DR
This paper introduces a temporal graph framework that captures neural network training dynamics, enabling early performance prediction and generalization across architectures and network sizes.
Contribution
It proposes a novel, scalable graph-based method to effectively model and analyze the evolving training dynamics of deep neural networks.
Findings
Accurately predicts task performance from early training epochs.
Captures generalizable training dynamics across different network sizes.
Effective across multiple architectures and datasets.
Abstract
Understanding the training dynamics of deep neural networks (DNNs) is important as it can lead to improved training efficiency and task performance. Recent works have demonstrated that representing the wiring of a DNN as a single static graph cannot capture how DNNs change over the course of training. Thus, in this work, we propose a compact, expressive temporal graph framework that effectively captures the dynamics of many workhorse architectures in computer vision. Specifically, it extracts an informative summary of graph properties (e.g., eigenvector centrality) over a sequence of DNN graphs obtained during training. We demonstrate that our framework captures useful dynamics by accurately predicting trained task performance when using a summary over early training epochs (<5) across four different architectures and two image datasets. Moreover, by using a novel, highly scalable DNN graph…
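The idea of summarizing graph properties over a sequence of DNN graphs can be illustrated with a small sketch. This is not the authors' pipeline: the graph construction (neurons as nodes, absolute weight values as undirected edge weights), the toy random "checkpoints", and the choice of mean, standard deviation, and maximum as summary statistics are all assumptions made for illustration. The helper names `layer_graph`, `eigenvector_centrality`, and `temporal_summary` are hypothetical.

```python
import math
import random
import statistics


def layer_graph(weights):
    """Build a symmetric adjacency matrix from dense layer weights.

    weights[k] is a nested list of shape (n_k, n_{k+1}). Each neuron is a
    node and each edge weight is |w| (an assumption, not the paper's rule).
    """
    sizes = [len(weights[0])] + [len(w[0]) for w in weights]
    n = sum(sizes)
    adj = [[0.0] * n for _ in range(n)]
    offset = 0
    for w, n_in in zip(weights, sizes):
        out_offset = offset + n_in
        for i, row in enumerate(w):
            for j, value in enumerate(row):
                adj[offset + i][out_offset + j] = abs(value)
                adj[out_offset + j][offset + i] = abs(value)
        offset += n_in
    return adj


def eigenvector_centrality(adj, iters=200, tol=1e-10):
    """Power iteration on (A + I); the shift avoids oscillation on layered graphs."""
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(iters):
        nxt = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt)) or 1.0
        nxt = [v / norm for v in nxt]
        if sum(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x


def temporal_summary(snapshots):
    """Concatenate per-epoch centrality statistics into one feature vector."""
    features = []
    for weights in snapshots:
        c = eigenvector_centrality(layer_graph(weights))
        features.extend([statistics.mean(c), statistics.pstdev(c), max(c)])
    return features


def random_layer(n_in, n_out):
    """Toy stand-in for a layer's weights at one training checkpoint."""
    return [[random.gauss(0, 1) for _ in range(n_out)] for _ in range(n_in)]


random.seed(0)
# Three hypothetical early-epoch checkpoints of a 4-3-2 fully connected network.
snapshots = [[random_layer(4, 3), random_layer(3, 2)] for _ in range(3)]
summary = temporal_summary(snapshots)
print(len(summary), "features from", len(snapshots), "epochs")
```

In a setting like the one the abstract describes, a feature vector of this kind, computed from the first few epochs, would be the input to a separate predictor of final task performance. That predictor is not shown here.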
Taxonomy
Topics: Health, Environment, Cognitive Aging · Advanced Graph Neural Networks · Human Mobility and Location-Based Analysis
Methods: 1x1 Convolution · Average Pooling · Bottleneck Residual Block · Dense Connections · Softmax · Dropout · Batch Normalization · Kaiming Initialization · Residual Connection
