Randomness In Neural Network Training: Characterizing The Impact of Tooling
Donglin Zhuang, Xingyao Zhang, Shuaiwen Leon Song, Sara Hooker

TL;DR
This paper investigates how tooling choices introduce randomness into neural network training. The resulting non-determinism affects model performance on certain parts of the data distribution, and the cost of ensuring determinism differs across hardware and architectures and can be significant.
Contribution
It provides a comprehensive large-scale analysis of tooling-induced non-determinism in neural network training across diverse hardware and architectures.
Findings
Non-determinism affects model performance on certain parts of the data distribution, even though top-line metrics such as top-1 accuracy are not noticeably impacted.
Deterministic tooling is crucial for AI safety.
The cost of ensuring determinism varies greatly across hardware and architectures.
Abstract
The quest for determinism in machine learning has disproportionately focused on characterizing the impact of noise introduced by algorithmic design choices. In this work, we address a less well understood and studied question: how does our choice of tooling introduce randomness to deep neural network training. We conduct large-scale experiments across different types of hardware, accelerators, state-of-the-art networks, and open-source datasets, to characterize how tooling choices contribute to the level of non-determinism in a system, the impact of said non-determinism, and the cost of eliminating different sources of noise. Our findings are surprising, and suggest that the impact of non-determinism is nuanced. While top-line metrics such as top-1 accuracy are not noticeably impacted, model performance on certain parts of the data distribution is far more sensitive to the introduction of…
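The contrast between a stable top-line metric and unstable subgroup performance can be illustrated with a minimal sketch. The labels and predictions below are hypothetical toy values, not results from the paper, and the helper names `top1_accuracy` and `per_class_accuracy` are introduced here only for illustration. The two "runs" stand in for two trainings of the same model that differ only in tooling-induced noise.

```python
from collections import defaultdict


def top1_accuracy(labels, preds):
    """Fraction of examples whose predicted class matches the label."""
    return sum(y == p for y, p in zip(labels, preds)) / len(labels)


def per_class_accuracy(labels, preds):
    """Accuracy computed separately for each ground-truth class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p in zip(labels, preds):
        total[y] += 1
        correct[y] += int(y == p)
    return {c: correct[c] / total[c] for c in sorted(total)}


# Hypothetical toy data: 8 examples, 2 classes (assumption, not from the paper).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
run_a = [0, 0, 0, 0, 1, 1, 0, 0]  # all of class 0 correct, half of class 1
run_b = [0, 0, 1, 1, 1, 1, 1, 1]  # half of class 0 correct, all of class 1

for name, preds in (("run_a", run_a), ("run_b", run_b)):
    print(name, top1_accuracy(labels, preds), per_class_accuracy(labels, preds))
```

In this toy case both runs reach the same top-1 accuracy while their per-class accuracies differ, which is why an aggregate metric alone can hide run-to-run variation on parts of the data.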
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)
