Predicting Defective Lines Using a Model-Agnostic Technique
Supatsara Wattanakriengkrai, Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Hideaki Hata, and Kenichi Matsumoto

TL;DR
This paper introduces LINE-DP, a model-agnostic, explainable AI framework that accurately predicts defective lines in source code, significantly reducing inspection effort and computational cost.
Contribution
The novel LINE-DP framework combines file-level defect modeling with LIME-based risky token identification to locate defective lines, improving precision over baseline methods.
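To make the two-stage idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the tokenizer, the toy log-odds file-level model, and every function name below are hypothetical, and LIME is replaced by a much simpler occlusion test (drop one token, see how far the file-level score falls). The training files are invented for illustration only.

```python
import math
import re
from collections import Counter

def tokens_of(text):
    """Crude tokenizer: identifiers and keywords only (an assumption)."""
    return re.findall(r"[A-Za-z_]\w*", text)

def train_file_model(files, labels):
    """Toy file-level model: one smoothed log-odds weight per token."""
    bad, good = Counter(), Counter()
    for text, is_defective in zip(files, labels):
        (bad if is_defective else good).update(set(tokens_of(text)))
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    return {
        t: math.log((bad[t] + 1) / (n_bad + 2))
           - math.log((good[t] + 1) / (n_good + 2))
        for t in set(bad) | set(good)
    }

def predict_defective(weights, token_set):
    """Score in (0, 1) that a file containing these tokens is defective."""
    z = sum(weights.get(t, 0.0) for t in sorted(token_set))
    return 1.0 / (1.0 + math.exp(-z))

def risky_tokens(weights, text, top_k=3):
    """Occlusion stand-in for LIME: tokens whose removal lowers the score most."""
    present = set(tokens_of(text))
    base = predict_defective(weights, present)
    drops = {t: base - predict_defective(weights, present - {t}) for t in present}
    ranked = sorted(drops, key=lambda t: (-drops[t], t))
    return [t for t in ranked[:top_k] if drops[t] > 0]

def flag_lines(weights, text, top_k=3):
    """1-based numbers of the lines that contain at least one risky token."""
    risky = set(risky_tokens(weights, text, top_k))
    return [i for i, line in enumerate(text.splitlines(), start=1)
            if risky & set(tokens_of(line))]

# Invented training data: two defective files (1) and two clean files (0).
train_files = [
    "int x = parse(input);\nfree(buf);\nfree(buf);",
    "char *p = strcpy(dst, src);\nreturn p;",
    "int x = parse(input);\nreturn x;",
    "int y = add(a, b);\nreturn y;",
]
train_labels = [1, 1, 0, 0]

weights = train_file_model(train_files, train_labels)
new_file = "int n = parse(input);\nstrcpy(dst, src);\nreturn n;"
flagged = flag_lines(weights, new_file)
print(flagged)  # [2]
```

In this toy run only the `strcpy` line is flagged, because its tokens appear solely in the defective training files. A real setup would swap the toy model for a trained classifier and the occlusion step for LIME explanations.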
Findings
Achieves an average recall of 0.61 in defect detection.
Reduces false alarm rate to 0.47.
Identifies 63% of defective lines related to common defects.
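For readers unfamiliar with the two metrics above, the sketch below computes them from line-level confusion counts using the standard definitions, which are assumed here rather than quoted from the paper: recall = TP / (TP + FN) and false alarm rate = FP / (FP + TN). The function name and the counts are hypothetical toy values, not results from the study.

```python
def recall_and_false_alarm(tp, fn, fp, tn):
    """Return (recall, false alarm rate) from confusion-matrix counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # share of defective lines found
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0   # share of clean lines wrongly flagged
    return recall, false_alarm

# Toy counts: 4 defective lines (3 found), 4 clean lines (1 wrongly flagged).
recall, false_alarm = recall_and_false_alarm(tp=3, fn=1, fp=1, tn=3)
print(recall, false_alarm)  # 0.75 0.25
```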
Abstract
Defect prediction models are proposed to help a team prioritize the areas of source code files that need Software Quality Assurance (SQA) based on the likelihood of having defects. However, developers may waste effort inspecting a whole file while only a small fraction of its source code lines are defective. Indeed, we find that as little as 1%-3% of the lines of a file are defective. Hence, in this work, we propose a novel framework (called LINE-DP) to identify defective lines using a model-agnostic technique, i.e., an Explainable AI technique that provides information about why the model makes such a prediction. Broadly speaking, our LINE-DP first builds a file-level defect model using code token features. Then, our LINE-DP uses a state-of-the-art model-agnostic technique (i.e., LIME) to identify risky tokens, i.e., code tokens that lead the file-level defect model to predict that the file…
