An Empirical Study of Fault Localisation Techniques for Deep Learning
Nargiz Humbatova, Jinhan Kim, Gunel Jahangirova, Shin Yoo, Paolo Tonella

TL;DR
This study evaluates and compares fault localisation techniques for deep neural networks, shows how the choice of ground truth affects measured performance, and identifies Dfd as the most effective technique on the benchmark.
Contribution
It provides a comprehensive empirical comparison of state-of-the-art DNN fault localisation techniques using real and mutated faults, highlighting the importance of ground truth selection.
Findings
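Evaluating against a single, specific ground truth (e.g., the human-defined one) yields low scores: a maximum average recall of 0.31 and precision of 0.23.
Accepting alternative equivalent patches as ground truth raises the measured localisation effectiveness.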
Dfd outperforms the other techniques, with higher recall and precision.
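To make the role of ground truth selection concrete, here is a minimal Python sketch of how precision and recall might be computed for one tool's output, first against a single ground truth and then against a set of alternative equivalent patches. The set-based metric, the best-match rule, the function names, and the fault labels are all assumptions made for illustration; this summary does not specify the paper's exact scoring procedure.

```python
def precision_recall(reported, ground_truth):
    """Set-based precision and recall for one faulty model (assumed metric)."""
    reported, ground_truth = set(reported), set(ground_truth)
    hits = len(reported & ground_truth)
    precision = hits / len(reported) if reported else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    return precision, recall


def best_over_alternatives(reported, alternatives):
    """Score against every alternative ground truth and keep the best match.

    'Best' here means highest recall, with precision as tie-breaker;
    this selection rule is a hypothetical choice, not the paper's.
    """
    scores = (precision_recall(reported, gt) for gt in alternatives)
    return max(scores, key=lambda pr: (pr[1], pr[0]))


# Hypothetical fault labels, for illustration only.
reported = {"learning_rate", "activation"}
human_gt = {"optimizer"}
alternatives = [human_gt, {"learning_rate"}, {"activation", "epochs"}]

single = precision_recall(reported, human_gt)
multi = best_over_alternatives(reported, alternatives)
print("single ground truth:", single)
print("with alternatives:  ", multi)
```

In this toy case the tool scores zero against the human-defined ground truth but matches one of the alternative patches fully on recall, which mirrors the direction of the effect described in the findings above.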
Abstract
With the increased popularity of Deep Neural Networks (DNNs), the need for tools that assist developers in the DNN implementation, testing and debugging process has also increased. Several approaches have been proposed that automatically analyse and localise potential faults in DNNs under test. In this work, we evaluate and compare existing state-of-the-art fault localisation techniques, which operate based on both dynamic and static analysis of the DNN. The evaluation is performed on a benchmark consisting of both real faults obtained from bug reporting platforms and faulty models produced by a mutation tool. Our findings indicate that using a single, specific ground truth (e.g., the human-defined one) to evaluate DNN fault localisation tools results in rather low performance (maximum average recall of 0.31 and precision of 0.23). However, such figures increase when…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Software System Performance and Reliability · Fault Detection and Control Systems
