Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and TensorFlow
Florian Tambon, Amin Nikanjam, Le An, Foutse Khomh, Giuliano Antoniol

TL;DR
This study empirically investigates silent bugs in Keras and TensorFlow, revealing their prevalence, impact, and categorization, and offers guidelines to mitigate these hidden issues affecting deep learning applications.
Contribution
First empirical analysis of silent bugs in popular DL frameworks, categorizing their effects and impact, and proposing guidelines to prevent such bugs.
Findings
77 reproducible silent bugs identified among 1,168 closed issues (about 6.6%)
Developed a categorization of silent bugs and impact levels
Survey confirmed the significant impact of silent bugs on DL development
Abstract
Deep Learning (DL) frameworks are now widely used, simplifying the creation of complex models as well as their integration into various applications, even for non-DL experts. However, like any other program, they are prone to bugs. This paper deals with the subcategory of bugs named silent bugs: they lead to wrong behavior, but they do not cause system crashes or hangs, nor do they show an error message to the user. Such bugs are even more dangerous in DL applications and frameworks due to the "black-box" and stochastic nature of the systems (the end user cannot understand how the model makes decisions). This paper presents the first empirical study of Keras and TensorFlow silent bugs and their impact on users' programs. We extracted closed issues related to Keras from the TensorFlow GitHub repository. Out of the 1,168 issues that we gathered, 77 were reproducible silent bugs affecting users'…
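To make the abstract's definition of a silent bug concrete, here is a minimal pure-Python sketch. It is not taken from the paper and does not reproduce any actual Keras or TensorFlow issue; the function names `buggy_mse` and `checked_mse` are hypothetical and exist only for illustration. The sketch shows a loss computation that returns a plausible value on malformed input without crashing or warning, next to a variant that fails loudly.

```python
def buggy_mse(y_true, y_pred):
    """Hypothetical mean squared error with a silent bug.

    zip() stops at the shorter input, so a length mismatch is ignored
    and a plausible-looking number is returned with no error message.
    """
    pairs = list(zip(y_true, y_pred))
    return sum((t - p) ** 2 for t, p in pairs) / len(pairs)


def checked_mse(y_true, y_pred):
    """Same computation, but a length mismatch raises instead of passing silently."""
    if len(y_true) != len(y_pred):
        raise ValueError(
            f"length mismatch: {len(y_true)} targets vs {len(y_pred)} predictions"
        )
    return buggy_mse(y_true, y_pred)


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0]  # one prediction is missing

# No crash, no hang, no message: the missing sample is simply dropped.
silent_result = buggy_mse(y_true, y_pred)
print("buggy_mse:", silent_result)

try:
    checked_mse(y_true, y_pred)
except ValueError as exc:
    print("checked_mse raised:", exc)
```

In the buggy variant the reported loss looks perfect even though a quarter of the targets were never evaluated, which is the kind of wrong-but-quiet behavior the abstract describes. The checked variant turns the same mistake into a visible error.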
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Software Engineering Research · Machine Learning and Data Classification
Methods: Network On Network
