The Impact of Dormant Defects on Defect Prediction: a Study of 19 Apache Projects

Davide Falessi; Aalok Ahluwalia; Massimiliano Di Penta

arXiv:2105.12372·cs.SE·May 27, 2021

TL;DR

This study examines how dormant defects, discovered long after their introduction, affect defect prediction accuracy in open source projects and proposes data filtering as a mitigation strategy.

Contribution

It analyzes the impact of dormant defects on classifier accuracy and evaluates the effectiveness of removing recent non-defective data to improve predictions.

Findings

1. Dormant defects reduce recall of defect classifiers.

2. Removing recent non-defective data improves classifier accuracy.

3. Mitigating dormant defects enhances defect dataset quality.

Abstract

Defect prediction models can help prioritize testing, analysis, or code review activities, and have been the subject of substantial effort in academia and of some applications in industrial contexts. A necessary precondition for creating a defect prediction model is the availability of defect data from the history of projects. If this data is noisy, the resulting defect prediction model may be unreliable. One cause of noise in defect datasets is the presence of "dormant defects", i.e., defects discovered several releases after their introduction. A dormant defect can cause a class to be labeled as defect-free when it is not; such a class is said to be "snoring". In this paper, we investigate the impact of snoring on classifiers' accuracy and the effectiveness of a possible countermeasure, i.e., dropping too-recent data from a training set. We analyze the accuracy of 15…
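
The countermeasure named in the abstract, dropping too-recent data from a training set, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `ClassRecord` type, the `drop_recent_releases` function, and the `n_recent` parameter are hypothetical names introduced here, and the summary above does not say how many recent releases should be dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassRecord:
    """One class snapshot in a defect dataset (hypothetical schema)."""
    name: str
    release: int     # ordinal index of the release the snapshot belongs to
    defective: bool  # label as known when the dataset was built


def drop_recent_releases(records, n_recent):
    """Return the records that do not belong to the n_recent latest releases.

    Assumption: labels in the latest releases are the ones most likely to be
    wrong, because dormant defects there may not have been discovered yet.
    """
    if n_recent < 0:
        raise ValueError("n_recent must be non-negative")
    if n_recent == 0:
        return list(records)
    # Distinct release indices, oldest first.
    releases = sorted({r.release for r in records})
    # The n_recent newest releases are excluded from training.
    excluded = set(releases[max(0, len(releases) - n_recent):])
    return [r for r in records if r.release not in excluded]


records = [
    ClassRecord("A", 1, True),
    ClassRecord("B", 2, False),
    ClassRecord("C", 3, False),  # possibly snoring: defect not yet discovered
    ClassRecord("D", 4, False),  # possibly snoring: defect not yet discovered
]
training_set = drop_recent_releases(records, n_recent=2)
```

In this toy dataset, only the snapshots from releases 1 and 2 remain in `training_set`. The sketch drops every record from the excluded releases; a variant that drops only records labeled non-defective would keep the positive labels, which dormant defects do not make wrong.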
