Towards Cross-Project Defect Prediction with Imbalanced Feature Sets
Peng He, Bing Li, and Yutao Ma

TL;DR
This paper introduces an approach for cross-project defect prediction when the training and test projects have different feature sets (imbalanced feature sets), validates it on three public defect data sets, and reports that hybrid models can improve prediction performance.
Contribution
It proposes a distribution characteristic-based instance mapping method for CPDP with imbalanced feature sets, addressing a key limitation of traditional CPDP methods.
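To make the idea of a distribution characteristic-based instance mapping more concrete, here is a minimal sketch in Python. It assumes that each instance (object class) is replaced by a fixed-length vector of descriptive statistics computed over its own metric values, so that projects with different numbers of features end up in the same space. The choice of statistics, the function names, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def distribution_characteristics(row):
    """Summarize one instance's metric values as a fixed-length vector.

    Assumption: mean, median, min, max, standard deviation, and the first
    and third quartiles are used here only for illustration.
    """
    row = np.asarray(row, dtype=float)
    q1, median, q3 = np.percentile(row, [25, 50, 75])
    return np.array([row.mean(), median, row.min(), row.max(),
                     row.std(), q1, q3])

def map_instances(X):
    """Map every instance (row) of a project to its characteristic vector."""
    return np.vstack([distribution_characteristics(r) for r in X])

# Hypothetical toy data: the source project has 5 metrics, the target has 3.
source = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                   [0.0, 0.0, 10.0, 2.0, 8.0]])
target = np.array([[2.0, 4.0, 6.0]])

source_mapped = map_instances(source)  # shape (2, 7)
target_mapped = map_instances(target)  # shape (1, 7)
```

After the mapping, both projects share the same seven columns, so a classifier trained on `source_mapped` can be applied to `target_mapped`. In practice the raw metrics may sit on very different scales, so some normalization before computing the statistics would likely be needed.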
Findings
The proposed method is validated on three public defect data sets (PROMISE, ReLink, and AEEEM).
Hybrid models combining CPDP and CPDP-IFS improve prediction performance.
Empirical results support the effectiveness of the new approach.
Abstract
Cross-project defect prediction (CPDP) has been regarded as an emerging technique for software quality assurance, especially in new or inactive projects, and a few improved methods have been proposed to support better defect prediction. However, regular CPDP always assumes that the features of the training and test data are identical. Hence, very little is known about whether CPDP with imbalanced feature sets (CPDP-IFS) works well. Considering the diversity of defect data sets available on the Internet as well as the high cost of labeling data, in this paper we propose a simple approach that uses a distribution characteristic-based instance (object class) mapping, and demonstrate the validity of our method on three public defect data sets (i.e., PROMISE, ReLink and AEEEM). Besides, the empirical results indicate that the hybrid model…
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Imbalanced Data Classification Techniques
