Nondeterminism and Instability in Neural Network Optimization
Cecilia Summers, Michael J. Dinneen

TL;DR
This paper investigates how nondeterminism in neural network training causes run-to-run variability in model performance, identifies the instability of training as the key factor, and proposes two approaches to reduce this variability.
Contribution
It introduces an experimental protocol to analyze sources of nondeterminism, highlights the role of training instability, and offers two approaches to mitigate variability.
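One way to picture such a protocol is to give each source of nondeterminism its own random stream and vary only one stream between runs. The sketch below is illustrative only and assumes this reading: the source names (`init`, `shuffle`, `augment`), the function `make_rngs`, and the seed offsets are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical source names; the paper's actual list of sources is not given here.
SOURCES = ("init", "shuffle", "augment")

def make_rngs(base_seed, varied, run):
    """Build one RNG per source; only the `varied` source depends on `run`."""
    rngs = {}
    for i, name in enumerate(SOURCES):
        seed = base_seed + 1000 * i      # fixed seed shared by every run
        if name == varied:
            seed += run + 1              # changes from run to run (run < 999)
        rngs[name] = random.Random(seed)
    return rngs

# Two runs that differ only in the data-shuffling stream.
run_a = make_rngs(base_seed=0, varied="shuffle", run=0)
run_b = make_rngs(base_seed=0, varied="shuffle", run=1)
init_a, init_b = run_a["init"].random(), run_b["init"].random()
shuffle_a, shuffle_b = run_a["shuffle"].random(), run_b["shuffle"].random()
```

Under this setup, any difference between the two trained models could be attributed to the single varied stream, since every other stream produces identical draws.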
Findings
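All sources of nondeterminism have similar effects on measures of model diversity.
The instability of model training, taken as an end-to-end procedure, is the key determinant of this variability.
Even one-bit changes in initial parameters lead to models that converge to vastly different values.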
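To give a sense of how small a one-bit change is, the sketch below flips the least significant bit of a single float32 value. It is a minimal illustration, assuming float32 parameters and a flip of the lowest mantissa bit. The helper `flip_lowest_bit` is hypothetical, and the paper's exact perturbation procedure is not described in this summary.

```python
import struct

def flip_lowest_bit(x: float) -> float:
    """Round x to float32 and flip its least significant mantissa bit."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))        # float32 -> raw 32 bits
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits ^ 1))  # flip bit 0, convert back
    return flipped

w = 0.5                                   # stand-in for one initial parameter
w_perturbed = flip_lowest_bit(w)
rel_change = abs(w_perturbed - w) / abs(w)
print(w_perturbed != w, rel_change)
```

For this value the relative change is 2^-23, roughly one part in ten million, which is the scale of perturbation the finding refers to.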
Abstract
Nondeterminism in neural network optimization produces uncertainty in performance, making small improvements difficult to discern from run-to-run variability. While uncertainty can be reduced by training multiple model copies, doing so is time-consuming, costly, and harms reproducibility. In this work, we establish an experimental protocol for understanding the effect of optimization nondeterminism on model diversity, allowing us to isolate the effects of a variety of sources of nondeterminism. Surprisingly, we find that all sources of nondeterminism have similar effects on measures of model diversity. To explain this intriguing fact, we identify the instability of model training, taken as an end-to-end procedure, as the key determinant. We show that even one-bit changes in initial parameters result in models converging to vastly different values. Last, we propose two approaches for…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
