Feature Sets in Just-in-Time Defect Prediction: An Empirical Evaluation

Peter Bludau; Alexander Pretschner

arXiv:2209.13978·cs.SE·September 29, 2022

TL;DR

This paper evaluates existing feature sets for just-in-time defect prediction on six open-source projects, introduces two new feature sets, and shows that combining all of them improves both prediction performance (MCC) and effort-aware defect detection.

Contribution

It proposes two new feature sets for just-in-time defect prediction and shows that models trained on the combination of all feature sets perform best among the state-of-the-art approaches compared.

Findings

01

Combining all feature sets improves MCC by 21% on average.

02

Effort-aware prediction identifies on average 14% more defects when inspecting 20% of changed lines.

03

Models built on the combined feature sets outperform the state-of-the-art approaches they are compared with.
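
For readers unfamiliar with the metric behind the first finding, MCC (Matthews correlation coefficient) is computed from the four confusion-matrix counts of a binary classifier. The sketch below uses the standard definition; the function name `mcc` and the choice to return 0.0 when the denominator is zero are assumptions made for illustration and are not taken from the paper.

```python
import math


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts, not results from the paper.
    print(mcc(tp=40, fp=10, tn=45, fn=5))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it a common choice when defective changes are a small minority of all changes.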

Abstract

Just-in-time defect prediction assigns a defect risk to each new change to a software repository in order to prioritize review and testing efforts. Over the last decades, different approaches have been proposed in the literature to craft more accurate prediction models. However, defect prediction is still not widely used in industry, because prediction performance varies. In this study, we evaluate existing features on six open-source projects and propose two new feature sets not yet discussed in the literature. By combining all feature sets, we improve MCC by 21% on average, leading to the best-performing models when compared to state-of-the-art approaches. We also evaluate effort-awareness and find that, on average, 14% more defects can be identified when inspecting 20% of changed lines.
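
The effort-aware result in the abstract counts how many defects are found within a fixed inspection budget of 20% of changed lines. The following is a minimal sketch of one common way to compute such a measure, assuming changes are ranked by predicted risk per changed line and inspected in that order until the budget is used up. The function name `recall_at_effort`, the tuple layout, the ranking rule, and the stopping rule are all assumptions for illustration; the abstract does not describe the paper's exact procedure.

```python
from typing import List, Tuple

# Each change: (predicted_risk, changed_lines, is_defective). Hypothetical layout.
Change = Tuple[float, int, bool]


def recall_at_effort(changes: List[Change], effort_fraction: float = 0.2) -> float:
    """Fraction of defective changes found within a line-inspection budget."""
    total_lines = sum(lines for _, lines, _ in changes)
    total_defects = sum(1 for _, _, defective in changes if defective)
    if total_lines == 0 or total_defects == 0:
        return 0.0

    budget = effort_fraction * total_lines
    # Assumed ranking: highest risk per changed line first.
    ranked = sorted(changes, key=lambda c: c[0] / max(c[1], 1), reverse=True)

    inspected_lines = 0
    found = 0
    for _, lines, defective in ranked:
        # Assumed stopping rule: stop at the first change that exceeds the budget.
        if inspected_lines + lines > budget:
            break
        inspected_lines += lines
        if defective:
            found += 1
    return found / total_defects


if __name__ == "__main__":
    # Made-up example data, not taken from the paper.
    example = [
        (0.9, 10, True),
        (0.8, 100, True),
        (0.3, 20, False),
        (0.2, 5, True),
        (0.1, 65, False),
    ]
    print(recall_at_effort(example))
```

Ranking by risk per line favors small, risky changes, so more defective changes fit into the same budget than when ranking by risk alone.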
