TL;DR
DeepFD is a learning-based framework that automatically diagnoses and localizes faults in deep learning programs by analyzing runtime features during training, significantly improving accuracy over existing rule-based methods.
Contribution
DeepFD introduces a novel learning approach for fault diagnosis and localization in DL programs, moving beyond neuron-focused and rule-based methods to identify root causes effectively.
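To make the idea of learning from runtime features concrete, here is a minimal sketch, not DeepFD's actual implementation. It assumes that per-epoch traces such as loss and accuracy were recorded during training, collapses each trace into a few summary statistics, and labels a new run with the fault type of its nearest labeled neighbor. The function names, the chosen statistics, the nearest-neighbor classifier, and the toy data are all hypothetical.

```python
from math import dist
from statistics import mean, pstdev


def summarize_trace(values):
    """Collapse one per-epoch trace into a few summary statistics."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return [
        mean(values),                      # average level
        pstdev(values),                    # spread across epochs
        values[-1] - values[0],            # net change over training
        sum(d > 0 for d in diffs) / len(diffs) if diffs else 0.0,  # share of rising steps
    ]


def extract_features(run):
    """Flatten all traces of a run into one vector (trace names sorted for a stable order)."""
    features = []
    for name in sorted(run):
        features.extend(summarize_trace(run[name]))
    return features


def diagnose(run, labeled_runs):
    """Return the label of the labeled run whose feature vector is closest (1-nearest neighbor)."""
    target = extract_features(run)
    _, label = min(
        ((dist(target, extract_features(r)), lab) for r, lab in labeled_runs),
        key=lambda pair: pair[0],
    )
    return label


# Toy, hand-written traces (illustrative only, not data from the paper).
labeled_runs = [
    ({"loss": [2.3, 1.6, 1.1, 0.8, 0.6],
      "accuracy": [0.2, 0.45, 0.6, 0.7, 0.78]}, "no fault"),
    ({"loss": [2.3, 5.9, 3.1, 8.4, 6.7],
      "accuracy": [0.1, 0.12, 0.09, 0.11, 0.1]}, "learning rate too large"),
]
new_run = {"loss": [2.3, 6.5, 4.0, 9.1, 7.2],
           "accuracy": [0.1, 0.1, 0.11, 0.09, 0.1]}

result = diagnose(new_run, labeled_runs)
print(result)
```

In this sketch the diagnosis comes from labeled examples rather than hand-written thresholds, which is the general contrast with rule-based checkers that the summary draws.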
Findings
Correctly diagnoses 52% of faulty DL programs
Locates faults in 42% of faulty programs
Outperforms state-of-the-art methods in accuracy
Abstract
As Deep Learning (DL) systems are widely deployed for mission-critical applications, debugging such systems becomes essential. Most existing works identify and repair suspicious neurons in the trained Deep Neural Network (DNN), which, unfortunately, might be a detour. Specifically, several existing studies have reported that many unsatisfactory behaviors actually originate from faults residing in DL programs. Besides, locating faulty neurons is not actionable for developers, while locating the faulty statements in DL programs can provide developers with more useful information for debugging. Though a few recent studies have proposed pinpointing the faulty statements in DL programs or the training settings (e.g., too large a learning rate), they were mainly designed based on predefined rules, leading to many false alarms or false negatives, especially when the faults are beyond…
Taxonomy
Methods: Repair
