The Impact of Using Regression Models to Build Defect Classifiers
Gopi Krishnan Rajbahadur, Shaowei Wang, Yasutaka Kamei, Ahmed E. Hassan

TL;DR
This study compares discretized defect classifiers with regression-based classifiers across multiple datasets and models, revealing that regression approaches can outperform traditional discretized methods, especially in datasets with low defect ratios.
Contribution
The paper introduces a comparative analysis of discretized and regression-based defect classifiers, highlighting the potential advantages of regression models in defect prediction tasks.
Findings
Random forest classifiers perform best among tested models.
Discretized classifiers do not always outperform regression-based classifiers.
Regression models are particularly beneficial when defect ratios are low.
Abstract
It is common practice to discretize continuous defect counts into defective and non-defective classes and use them as a target variable when building defect classifiers (discretized classifiers). However, this discretization of continuous defect counts leads to information loss that might affect the performance and interpretation of defect classifiers. Another possible approach to build defect classifiers is through the use of regression models then discretizing the predicted defect counts into defective and non-defective classes (regression-based classifiers). In this paper, we compare the performance and interpretation of defect classifiers that are built using both approaches (i.e., discretized classifiers and regression-based classifiers) across six commonly used machine learning classifiers (i.e., linear/logistic regression, random forest, KNN, SVM, CART, and neural networks) and…
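The two approaches described in the abstract can be contrasted with a minimal sketch. This is not the authors' experimental setup: it uses a toy k-nearest-neighbours model (KNN is one of the model families named above), and the feature values, the `k=3` setting, the "count > 0" labelling rule, and the `threshold=0.5` cut-off on predicted counts are all assumptions chosen for illustration.

```python
from statistics import mean


def k_nearest(train_X, x, k):
    """Return indices of the k training rows closest to x (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return sorted(range(len(train_X)), key=lambda i: dists[i])[:k]


def discretized_classifier(train_X, defect_counts, x, k=3):
    """Discretize the defect counts FIRST, then classify by majority vote."""
    # Assumed labelling rule: any module with at least one defect is "defective".
    labels = [1 if c > 0 else 0 for c in defect_counts]
    votes = [labels[i] for i in k_nearest(train_X, x, k)]
    return 1 if sum(votes) * 2 > len(votes) else 0


def regression_based_classifier(train_X, defect_counts, x, k=3, threshold=0.5):
    """Predict a defect count by regression FIRST, then discretize the prediction."""
    predicted = mean(defect_counts[i] for i in k_nearest(train_X, x, k))
    # Assumed cut-off: the threshold value is illustrative, not taken from the paper.
    return 1 if predicted > threshold else 0


# Hypothetical modules described by (lines of code, churn) and their defect counts.
train_X = [(100, 2), (120, 3), (400, 10), (420, 12), (150, 1), (390, 9)]
defect_counts = [0, 0, 5, 0, 0, 0]

new_module = (410, 11)
discretized_label = discretized_classifier(train_X, defect_counts, new_module)
regression_label = regression_based_classifier(train_X, defect_counts, new_module)
print(discretized_label, regression_label)
```

In this toy data only one of the three nearest neighbours is defective, so the majority vote over discretized labels treats it the same as a module with a single defect. The regression-based variant still sees that the neighbour had five defects, which is the kind of information the abstract says is lost by discretizing up front.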
Taxonomy
Methods: Support Vector Machine
