On the Need of Preserving Order of Data When Validating Within-Project Defect Classifiers

Davide Falessi; Jacky Huang; Likhita Narayana; Jennifer Fong Thai; Burak Turhan

arXiv:1809.01510·cs.SE·August 3, 2020

TL;DR

This paper investigates how preserving the temporal order of data affects the validation of defect classifiers, showing that different validation techniques can lead to significantly different accuracy estimates and emphasizing the importance of choosing appropriate methods.

Contribution

It provides an empirical comparison of time-series and non-time-series validation techniques for defect prediction, highlighting the impact of data order preservation on classifier accuracy measurement.

Findings

01

Walk-forward validation often yields different AUC scores compared to cross-validation and bootstrap.

02

Depending on which techniques are compared, classifier accuracy differs significantly in roughly half or more of the cases.

03

Choosing the validation technique affects the realism and conclusions of defect prediction models.

Abstract

[Context] Defect prediction models, such as classifiers, can support the allocation of testing resources by using data from previous releases of the same project to predict which software components are likely to be defective. A validation technique (hereinafter, technique) defines a specific way to split the available data into training and test sets in order to measure a classifier's accuracy. Time-series techniques have the unique ability to preserve the temporal order of data, i.e., they prevent the test set from containing data that antedates the training set. [Aim] The aim of this paper is twofold: first, we check whether there is a difference in the classifiers' accuracy as measured by time-series versus non-time-series techniques. Afterward, we check for a possible reason for this difference, i.e., whether defect rates change across releases of a project. [Method] Our method consists of measuring the accuracy,…
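To make the distinction in the abstract concrete, here is a minimal Python sketch of an order-preserving (walk-forward) split next to a shuffled k-fold split. It is an illustration under stated assumptions, not the authors' experimental setup: the function names, the toy release labels, and the choice to train on all earlier releases are hypothetical. A small helper also computes the defect rate per release, the quantity the second aim looks at.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple

Split = Tuple[List[int], List[int]]  # (training indices, test indices)


def walk_forward_splits(releases: Sequence[int]) -> Iterator[Split]:
    """Train on every release before r, test on release r (assumed variant)."""
    for r in sorted(set(releases))[1:]:
        train = [i for i, rel in enumerate(releases) if rel < r]
        test = [i for i, rel in enumerate(releases) if rel == r]
        yield train, test


def kfold_splits(n: int, k: int, seed: int = 0) -> Iterator[Split]:
    """Shuffled k-fold split; it ignores the release each item belongs to."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g, fold in enumerate(folds) if g != f for i in fold)
        yield train, test


def preserves_order(releases: Sequence[int], train: List[int], test: List[int]) -> bool:
    """True if every training item comes from an earlier release than every test item."""
    return max(releases[i] for i in train) < min(releases[i] for i in test)


def defect_rate_per_release(releases: Sequence[int], defective: Sequence[bool]) -> Dict[int, float]:
    """Fraction of defective components in each release."""
    totals: Dict[int, int] = {}
    defects: Dict[int, int] = {}
    for rel, d in zip(releases, defective):
        totals[rel] = totals.get(rel, 0) + 1
        defects[rel] = defects.get(rel, 0) + int(d)
    return {rel: defects[rel] / totals[rel] for rel in sorted(totals)}


# Toy data (hypothetical): nine components spread over three releases.
releases = [1, 1, 1, 2, 2, 2, 3, 3, 3]
defective = [True, False, False, True, True, False, False, False, False]

wf = list(walk_forward_splits(releases))
kf = list(kfold_splits(len(releases), k=3))
wf_ok = all(preserves_order(releases, tr, te) for tr, te in wf)
kf_ok = all(preserves_order(releases, tr, te) for tr, te in kf)
rates = defect_rate_per_release(releases, defective)
```

In this toy setting every walk-forward split keeps training data strictly older than test data, while the shuffled k-fold splits cannot all do so, because each fold mixes items from different releases. If the defect rate shifts from release to release, as it does in the toy labels, the two kinds of split can expose a classifier to different class distributions, which is one plausible reason their accuracy estimates diverge.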
