Feature Sets in Just-in-Time Defect Prediction: An Empirical Evaluation
Peter Bludau, Alexander Pretschner

TL;DR
This paper empirically evaluates various feature sets for just-in-time defect prediction, introduces two new feature sets, and demonstrates improved prediction performance and defect detection efficiency.
Contribution
It introduces two novel feature sets for just-in-time defect prediction and shows that combining all evaluated feature sets improves MCC by 21% on average across six open-source projects.
Findings
Combining all feature sets improves MCC by 21% on average.
Effort-aware prediction identifies on average 14% more defects when inspecting 20% of changed lines.
Models built on the combined feature sets perform best when compared to state-of-the-art approaches.
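For readers unfamiliar with the metric, MCC (Matthews correlation coefficient) summarizes a binary confusion matrix in a single value between -1 and 1. The following is a minimal sketch of the standard definition, not the authors' evaluation code; the function name `mcc` and the convention of returning 0.0 for a zero denominator are assumptions made for illustration.

```python
import math


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/fp/tn/fn are the counts of true positives, false positives,
    true negatives and false negatives (here: defective vs. clean changes).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: report 0.0 when a row or column of the
        # confusion matrix is empty and the coefficient is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

A perfect classifier scores 1, a classifier that is always wrong scores -1, and a classifier no better than chance scores 0. Because all four cells enter the formula, the metric remains informative when defective changes are a small minority.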
Abstract
Just-in-time defect prediction assigns a defect risk to each new change to a software repository in order to prioritize review and testing efforts. Over the last decades, different approaches have been proposed in the literature to craft more accurate prediction models. However, defect prediction is still not widely used in industry, because prediction performance varies. In this study, we evaluate existing features on six open-source projects and propose two new feature sets not yet discussed in the literature. By combining all feature sets, we improve MCC by 21% on average, leading to the best-performing models when compared to state-of-the-art approaches. We also evaluate effort-awareness and find that on average 14% more defects can be identified when inspecting 20% of changed lines.
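The effort-aware result can be read as "what share of defective changes is found if reviewers inspect only 20% of all changed lines". The sketch below shows one common way to compute such a measure and is not taken from the paper: ranking changes by predicted risk per changed line, the tuple layout, and the function name `recall_at_effort` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Each change is (predicted_risk, lines_changed, is_defective).
Change = Tuple[float, int, bool]


def recall_at_effort(changes: List[Change], budget: float = 0.2) -> float:
    """Fraction of defective changes found within a line-inspection budget.

    Assumption: changes are inspected in descending order of risk density
    (predicted risk divided by changed lines), and inspection stops before
    the first change that would exceed `budget` of all changed lines.
    """
    total_lines = sum(lines for _, lines, _ in changes)
    total_defects = sum(1 for _, _, defective in changes if defective)
    if total_lines == 0 or total_defects == 0:
        return 0.0

    ranked = sorted(changes, key=lambda c: c[0] / max(c[1], 1), reverse=True)
    limit = budget * total_lines

    inspected_lines = 0
    found = 0
    for _, lines, defective in ranked:
        if inspected_lines + lines > limit:
            break
        inspected_lines += lines
        found += int(defective)
    return found / total_defects


# Hypothetical data: four changes, two of them defective.
example = [(0.9, 10, True), (0.8, 100, True), (0.1, 10, False), (0.5, 80, False)]
print(recall_at_effort(example))  # 0.5
```

In the example, 20% of the 200 changed lines gives a budget of 40 lines. The two small changes are inspected first, which finds one of the two defective changes; the large defective change does not fit in the remaining budget.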
