Towards Effective Bug Triage with Software Data Reduction Techniques
Jifeng Xuan, He Jiang, Yan Hu, Zhilei Ren, Weiqin Zou, Zhongxuan Luo, Xindong Wu

TL;DR
This paper presents a combined data reduction approach using instance and feature selection to improve bug triage accuracy and efficiency on large datasets from open source projects.
Contribution
It introduces a novel method for reducing bug data scale and enhancing triage quality by combining instance and feature selection based on historical data.
Findings
Data reduction improves bug triage accuracy.
Method effectively reduces data size on large datasets.
Approach applicable to open source projects like Eclipse and Mozilla.
Abstract
Software companies spend over 45 percent of cost in dealing with software bugs. An inevitable step of fixing bugs is bug triage, which aims to correctly assign a developer to a new bug. To decrease the time cost in manual work, text classification techniques are applied to conduct automatic bug triage. In this paper, we address the problem of data reduction for bug triage, i.e., how to reduce the scale and improve the quality of bug data. We combine instance selection with feature selection to simultaneously reduce data scale on the bug dimension and the word dimension. To determine the order of applying instance selection and feature selection, we extract attributes from historical bug data sets and build a predictive model for a new bug data set. We empirically investigate the performance of data reduction on a total of 600,000 bug reports of two large open source projects, namely…
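To make the idea of reducing the bug dimension and the word dimension concrete, here is a minimal sketch in Python. It is not the authors' implementation: the ranking by document frequency, the duplicate-dropping rule, and the names `select_features`, `select_instances`, and `reduce_bug_data` are all assumptions chosen for illustration, and the four toy reports are invented.

```python
from collections import Counter

# Each report is (list of words, developer who fixed it). Toy data, invented for illustration.
reports = [
    (["crash", "editor", "save"], "alice"),
    (["crash", "editor", "open"], "alice"),
    (["layout", "render", "css"], "bob"),
    (["render", "css", "crash"], "bob"),
]

def select_features(reports, k):
    """Word dimension: keep the k words that occur in the most reports.
    Document frequency is a stand-in ranking, assumed here for simplicity."""
    doc_freq = Counter(word for words, _ in reports for word in set(words))
    ranked = sorted(doc_freq, key=lambda w: (-doc_freq[w], w))  # ties broken alphabetically
    keep = set(ranked[:k])
    return [([w for w in words if w in keep], dev) for words, dev in reports]

def select_instances(reports):
    """Bug dimension: drop reports that are empty or that repeat an earlier
    (word set, developer) pair. A stand-in rule, assumed here for simplicity."""
    seen = set()
    kept = []
    for words, dev in reports:
        key = (frozenset(words), dev)
        if words and key not in seen:
            seen.add(key)
            kept.append((words, dev))
    return kept

def reduce_bug_data(reports, k, features_first=True):
    """Apply both reductions in the chosen order."""
    if features_first:
        return select_instances(select_features(reports, k))
    return select_features(select_instances(reports), k)

features_then_instances = reduce_bug_data(reports, k=3, features_first=True)
instances_then_features = reduce_bug_data(reports, k=3, features_first=False)
print(len(features_then_instances), len(instances_then_features))
```

On this toy data the two orders keep different numbers of reports, because removing words first makes two of the reports identical. That is one simple way to see why the order of the two steps is worth predicting per data set.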
Taxonomy
Topics: Software Engineering Research · Machine Learning and Data Classification · Software Reliability and Analysis Research
