Predicting Defective Lines Using a Model-Agnostic Technique

Supatsara Wattanakriengkrai, Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Hideaki Hata, and Kenichi Matsumoto

arXiv:2009.03612·cs.SE·September 17, 2020

TL;DR

This paper introduces LINE-DP, a model-agnostic, explainable AI framework that accurately predicts defective lines in source code, significantly reducing inspection effort and computational cost.

Contribution

The novel LINE-DP framework combines file-level defect modeling with LIME-based risky token identification to locate defective lines, improving precision over baseline methods.

Findings

01

Achieves an average recall of 0.61 in defect detection.

02

Achieves an average false alarm rate of 0.47.

03

Finds that 63% of the defective lines LINE-DP identifies are related to common defects.
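
As a reading aid for the recall and false alarm rate quoted above, here is a minimal sketch that assumes the standard confusion-matrix definitions applied at the line level: recall = TP / (TP + FN) and false alarm rate = FP / (FP + TN). The function names and the counts are hypothetical and chosen only so the arithmetic is easy to follow; they are not data from the paper.

```python
def recall(tp: int, fn: int) -> float:
    """Share of truly defective lines that were flagged: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else 0.0


def false_alarm_rate(fp: int, tn: int) -> float:
    """Share of clean lines that were wrongly flagged: FP / (FP + TN)."""
    total = fp + tn
    return fp / total if total else 0.0


# Hypothetical line-level counts, for illustration only.
tp, fn = 61, 39    # defective lines: flagged vs. missed
fp, tn = 470, 530  # clean lines: flagged vs. correctly ignored

print(recall(tp, fn))            # 0.61
print(false_alarm_rate(fp, tn))  # 0.47
```

Under these definitions, a higher recall means fewer defective lines are missed, while a lower false alarm rate means fewer clean lines are sent for inspection.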

Abstract

Defect prediction models are proposed to help a team prioritize the source code files that need Software Quality Assurance (SQA) based on their likelihood of having defects. However, developers may waste effort inspecting a whole file when only a small fraction of its source code lines are defective. Indeed, we find that as little as 1%-3% of the lines of a file are defective. Hence, in this work, we propose a novel framework (called LINE-DP) to identify defective lines using a model-agnostic technique, i.e., an Explainable AI technique that provides information about why the model makes such a prediction. Broadly speaking, our LINE-DP first builds a file-level defect model using code token features. Then, our LINE-DP uses a state-of-the-art model-agnostic technique (i.e., LIME) to identify risky tokens, i.e., code tokens that lead the file-level defect model to predict that the file…
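
The two steps described in the abstract can be sketched in a few lines of standard-library Python. This is a toy illustration under stated assumptions, not the authors' implementation: a smoothed log-odds bag-of-tokens scorer stands in for the file-level defect model, leave-one-token-out occlusion stands in for LIME, and the final step (flagging any line that contains a risky token) is an assumption made for the sketch because the abstract above is truncated. All names (`train_file_model`, `risky_tokens`, `flag_lines`), the cutoff `k`, and the training snippets are hypothetical.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[A-Za-z_]\w*")  # identifiers and keywords only


def tokenize(text):
    return TOKEN_RE.findall(text)


def train_file_model(files):
    """files: list of (source_text, is_defective). Returns token -> log-odds weight."""
    defect, clean = Counter(), Counter()
    n_defect = n_clean = 0
    for text, is_defective in files:
        tokens = set(tokenize(text))
        if is_defective:
            defect.update(tokens)
            n_defect += 1
        else:
            clean.update(tokens)
            n_clean += 1
    # Laplace-smoothed log-odds of a token appearing in defective vs. clean files.
    return {
        t: math.log((defect[t] + 1) / (n_defect + 2))
        - math.log((clean[t] + 1) / (n_clean + 2))
        for t in set(defect) | set(clean)
    }


def predict_proba(weights, tokens):
    """Predicted probability that a file with these tokens is defective."""
    score = sum(weights.get(t, 0.0) for t in sorted(set(tokens)))
    return 1.0 / (1.0 + math.exp(-score))


def risky_tokens(weights, text, k=3):
    """Occlusion stand-in for LIME: rank tokens by how much removing each one
    lowers the predicted defect probability; keep the top k with a positive drop."""
    tokens = set(tokenize(text))
    base = predict_proba(weights, tokens)
    drops = {t: round(base - predict_proba(weights, tokens - {t}), 6) for t in tokens}
    ranked = sorted(drops.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, drop in ranked[:k] if drop > 0]


def flag_lines(text, risky):
    """1-based numbers of the lines that contain at least one risky token."""
    risky = set(risky)
    return [
        i for i, line in enumerate(text.splitlines(), start=1)
        if risky & set(tokenize(line))
    ]


# Hypothetical training files: (source snippet, was the file defective?)
TRAIN = [
    ("if (node == null) return node.size();", True),
    ("node.size(); if (x == null) x.close();", True),
    ("int total = a + b; return total;", False),
    ("for (int i = 0; i < n; i++) total += i;", False),
]
NEW_FILE = "int total = 0;\nif (node == null) total = node.size();\nreturn total;"

weights = train_file_model(TRAIN)
risky = risky_tokens(weights, NEW_FILE, k=3)
print(risky)                        # ['if', 'node', 'null']
print(flag_lines(NEW_FILE, risky))  # [2]
```

In this toy run only the second line of the new file is flagged, so a reviewer would inspect one line instead of the whole file. A real setup would replace the scorer with a trained classifier and the occlusion step with an actual LIME explainer.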
