The Impact of Automated Parameter Optimization on Defect Prediction Models
Chakkrit Tantithamthavorn, Shane McIntosh, Ahmed E. Hassan, Kenichi Matsumoto

TL;DR
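Automated parameter optimization improves the AUC of defect prediction models by up to 40 percentage points, yields classifiers that are at least as stable as those trained with default settings, and substantially shifts variable importance rankings, with little additional computational cost. The gains are largest for classifiers that are sensitive to their parameter settings.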
Contribution
This study demonstrates the substantial impact of automated parameter optimization on defect prediction models, highlighting its benefits and transferability across datasets.
Findings
Performance improvement up to 40 percentage points in AUC.
Optimized classifiers are at least as stable as those trained using default settings.
Optimized settings for 17 of the 20 most sensitive parameters transfer across datasets.
Abstract
Defect prediction models, classifiers that identify defect-prone software modules, have configurable parameters that control their characteristics (e.g., the number of trees in a random forest). Recent studies show that these classifiers underperform when default settings are used. In this paper, we study the impact of automated parameter optimization on defect prediction models. Through a case study of 18 datasets, we find that automated parameter optimization: (1) improves AUC performance by up to 40 percentage points; (2) yields classifiers that are at least as stable as those trained using default settings; (3) substantially shifts the importance ranking of variables, with as few as 28% of the top-ranked variables in optimized classifiers also being top-ranked in non-optimized classifiers; (4) yields optimized settings for 17 of the 20 most sensitive parameters that transfer among…
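To make the idea of automated parameter optimization concrete, the following is a minimal sketch of a grid search that selects a classifier parameter by validation AUC. It is an illustration only, not the paper's experimental setup: the k-nearest-neighbours classifier, the synthetic two-metric data, the single holdout split, the candidate values, and all function names are assumptions chosen for brevity.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def knn_score(train, x, k):
    """Fraction of the k nearest training modules that are defective."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
    )[:k]
    return sum(y for _, y in nearest) / k

def make_data(n, rng):
    """Synthetic modules (an assumption): two noisy metrics, defective when their sum is large."""
    rows = []
    for _ in range(n):
        x = (rng.gauss(0, 1), rng.gauss(0, 1))
        y = 1 if x[0] + x[1] + rng.gauss(0, 1) > 0.5 else 0
        rows.append((x, y))
    return rows

def grid_search(train, valid, candidates):
    """Score every candidate k on the validation set; return (best_k, results)."""
    labels = [y for _, y in valid]
    results = {}
    for k in candidates:
        scores = [knn_score(train, x, k) for x, _ in valid]
        results[k] = auc(scores, labels)
    best_k = max(results, key=results.get)
    return best_k, results

rng = random.Random(0)  # fixed seed so the run is repeatable
train, valid = make_data(200, rng), make_data(100, rng)
best_k, results = grid_search(train, valid, candidates=[1, 5, 9, 15, 25])

for k, value in results.items():
    print(f"k={k:2d}  validation AUC={value:.3f}")
print("selected k:", best_k)
```

Here k = 1 stands in for a fixed default setting; comparing its validation AUC with that of the selected k shows the kind of gap that parameter optimization is meant to close. A single holdout split is used only to keep the sketch short, and a resampling scheme would give a more reliable estimate.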
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Software System Performance and Reliability
