DeepStability: A Study of Unstable Numerical Methods and Their Solutions in Deep Learning
E. Kloberdanz, K. G. Kloberdanz, W. Le

TL;DR
DeepStability investigates numerical stability issues in deep learning libraries, identifies root causes, and provides solutions, including a database of issues and fixes to improve the reliability of DL systems.
Contribution
This paper introduces the first database of numerical stability issues in deep learning and analyzes their causes, manifestations, and fixes in PyTorch and TensorFlow.
Findings
Identified numerous unstable numerical methods in DL libraries.
Developed patches that improve numerical stability and were accepted by developers.
Provided a resource for future detection and prevention of numerical issues in DL.
Abstract
Deep learning (DL) has become an integral part of solutions to various important problems, which is why ensuring the quality of DL systems is essential. One of the challenges of achieving reliability and robustness of DL software is to ensure that algorithm implementations are numerically stable. DL algorithms require a large amount and a wide variety of numerical computations. A naive implementation of numerical computation can lead to errors that may result in incorrect or inaccurate learning and results. A numerical algorithm or a mathematical formula can have several implementations that are mathematically equivalent, but have different numerical stability properties. Designing numerically stable algorithm implementations is challenging, because it requires an interdisciplinary knowledge of software engineering, DL, and numerical analysis. In this paper, we study two mature DL…
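The abstract's point that mathematically equivalent formulas can have different numerical stability properties can be illustrated with a small sketch. The example below is not taken from the paper, PyTorch, or TensorFlow. It uses only the Python standard library, and the function names logsumexp_naive and logsumexp_stable are hypothetical, chosen for illustration. Both functions compute log(sum(exp(x_i))), but the naive form overflows for large inputs, while the shifted form does not.

```python
import math

def logsumexp_naive(xs):
    # Hypothetical helper: direct translation of log(sum(exp(x_i))).
    # math.exp raises OverflowError once x_i exceeds roughly 709 for doubles.
    return math.log(sum(math.exp(x) for x in xs))

def logsumexp_stable(xs):
    # Hypothetical helper: subtract the maximum before exponentiating,
    # so the largest term is exp(0) = 1 and cannot overflow.
    # Assumes xs is non-empty and contains only finite values.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

if __name__ == "__main__":
    small = [0.0, 1.0]
    large = [1000.0, 1000.0]

    # On small inputs the two forms agree up to rounding.
    print(logsumexp_naive(small), logsumexp_stable(small))

    # On large inputs only the shifted form succeeds.
    print(logsumexp_stable(large))
    try:
        logsumexp_naive(large)
    except OverflowError as err:
        print("naive form failed:", err)
```

The shift relies on the identity log(sum(exp(x_i))) = m + log(sum(exp(x_i - m))) for any constant m. Choosing m as the maximum is a common convention rather than the only valid choice.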
