Automatic Bug Triage using Semi-Supervised Text Classification
Jifeng Xuan, He Jiang, Zhilei Ren, Jun Yan, Zhongxuan Luo

TL;DR
This paper introduces a semi-supervised text classification method for bug triage that leverages both labeled and unlabeled bug reports, improving accuracy over traditional supervised methods.
Contribution
The paper combines a naive Bayes classifier with expectation-maximization (EM) to make use of unlabeled bug reports, and adds a weighted recommendation list that applies the weights of multiple developers when training the classifier.
Findings
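Outperforms existing supervised approaches in classification accuracy on Eclipse bug reports
Uses unlabeled bug reports, alongside a fraction of labeled ones, to train the classifier
Applies developer weights through a weighted recommendation list to improve classification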
Abstract
In this paper, we propose a semi-supervised text classification approach for bug triage to address the shortage of labeled bug reports in existing supervised approaches. This new approach combines a naive Bayes classifier with expectation-maximization to take advantage of both labeled and unlabeled bug reports. The approach first trains a classifier with a fraction of labeled bug reports. It then iteratively labels numerous unlabeled bug reports and trains a new classifier with the labels of all the bug reports. We also employ a weighted recommendation list to boost performance by imposing the weights of multiple developers in training the classifier. Experimental results on bug reports of Eclipse show that our new approach outperforms existing supervised approaches in terms of classification accuracy.
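The train, label, and retrain loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a multinomial naive Bayes model with Laplace smoothing, whitespace tokenization, soft (probabilistic) labels for the unlabeled reports, and a fixed number of EM iterations. All function names and the toy developer names in the usage example are hypothetical. The weighted recommendation list is omitted because this summary does not specify how the developer weights are computed.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing may differ.
    return text.lower().split()


def train_nb(docs, soft_labels, classes, vocab, alpha=1.0):
    """M-step: estimate log priors and log word likelihoods from weighted labels."""
    class_mass = {c: alpha for c in classes}  # smoothed class counts
    word_counts = {c: Counter() for c in classes}
    for tokens, weights in zip(docs, soft_labels):
        for c, w in weights.items():
            class_mass[c] += w
            for t in tokens:
                word_counts[c][t] += w
    total = sum(class_mass.values())
    log_prior = {c: math.log(class_mass[c] / total) for c in classes}
    log_like = {}
    for c in classes:
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((word_counts[c][t] + alpha) / denom) for t in vocab}
    return log_prior, log_like


def posterior(tokens, model, classes):
    """E-step for one report: P(developer | tokens), normalized in log space."""
    log_prior, log_like = model
    scores = {
        c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in log_like[c])
        for c in classes
    }
    top = max(scores.values())
    unnorm = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def em_naive_bayes(labeled, unlabeled, iterations=10):
    """labeled: list of (text, developer); unlabeled: list of text."""
    classes = sorted({dev for _, dev in labeled})
    lab_docs = [tokenize(text) for text, _ in labeled]
    unl_docs = [tokenize(text) for text in unlabeled]
    vocab = {t for doc in lab_docs + unl_docs for t in doc}
    hard = [{dev: 1.0} for _, dev in labeled]
    # Initial classifier: trained on the labeled fraction only.
    model = train_nb(lab_docs, hard, classes, vocab)
    for _ in range(iterations):
        # E-step: label the unlabeled reports with the current classifier.
        soft = [posterior(doc, model, classes) for doc in unl_docs]
        # M-step: retrain on all reports using hard and soft labels together.
        model = train_nb(lab_docs + unl_docs, hard + soft, classes, vocab)
    return model, classes


def recommend(text, model, classes, k=3):
    """Return the k developers with the highest posterior for a new report."""
    post = posterior(tokenize(text), model, classes)
    return sorted(post, key=post.get, reverse=True)[:k]


if __name__ == "__main__":
    labeled = [
        ("null pointer exception in java editor", "alice"),
        ("layout breaks in swt widget rendering", "bob"),
    ]
    unlabeled = [
        "editor crashes with null pointer",
        "swt widget layout wrong on resize",
    ]
    model, classes = em_naive_bayes(labeled, unlabeled, iterations=5)
    print(recommend("null pointer in editor", model, classes, k=1))
```

Soft labels are one common way to run EM with naive Bayes; a variant that assigns each unlabeled report to its single most probable developer would also fit the abstract's wording.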
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Advanced Malware Detection Techniques
