DebtFree: Minimizing Labeling Cost in Self-Admitted Technical Debt Identification using Semi-Supervised Learning
Huy Tu, Tim Menzies

TL;DR
DebtFree is a semi-supervised framework that significantly reduces labeling effort in identifying self-admitted technical debts in software comments, outperforming existing methods in effectiveness and efficiency.
Contribution
The paper introduces DebtFree, a novel semi-supervised approach that automatically pseudo-labels data and assists human experts, reducing labeling effort by up to 99% and improving SATD detection.
Findings
Reduces labeling effort by up to 99% when the training data is unlabeled.
Achieves statistically significant improvements over state-of-the-art models.
Enhances effectiveness of SATD identification in multiple software projects.
Abstract
Keeping track of and managing Self-Admitted Technical Debts (SATDs) is important for maintaining a healthy software project. The current active-learning SATD recognition tool requires manual inspection of 24% of the test comments on average to reach 90% recall. Only about 5% of all test comments are SATDs, so human experts must read almost five times as many comments as there are SATD comments, which shows how inefficient the tool is. Moreover, human experts are still prone to error: 95% of the false-positive labels from previous work were actually true positives. To solve these problems, we propose DebtFree, a two-mode framework based on unsupervised learning for identifying SATDs. In mode1, when the existing training data is unlabeled, DebtFree starts with an unsupervised learner to automatically pseudo-label the programming comments in the training data. In contrast, in mode2…
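The "almost a quintuple" figure in the abstract follows from simple arithmetic on the stated percentages. The sketch below reproduces it in Python. The function names and the corpus size of 10,000 comments are hypothetical and chosen only for illustration; the 24%, 5%, and 90% values come from the abstract.

```python
def reading_ratio(inspected_fraction: float, satd_fraction: float) -> float:
    """Comments read per SATD comment present in the test set."""
    return inspected_fraction / satd_fraction


def reading_cost_per_found(inspected_fraction: float,
                           satd_fraction: float,
                           recall: float) -> float:
    """Comments read per SATD comment actually found at the given recall."""
    return inspected_fraction / (satd_fraction * recall)


# Values stated in the abstract.
INSPECTED = 0.24   # share of test comments read by a human
SATD_RATE = 0.05   # share of test comments that are SATDs
RECALL = 0.90      # share of SATDs recovered

# Hypothetical corpus size, used only to make the numbers concrete.
total_comments = 10_000
comments_read = round(total_comments * INSPECTED)
satd_present = round(total_comments * SATD_RATE)
satd_found = round(satd_present * RECALL)

print(f"read {comments_read} comments to find {satd_found} of {satd_present} SATDs")
print(f"ratio to SATDs present: {reading_ratio(INSPECTED, SATD_RATE):.1f}")
print(f"ratio to SATDs found:   {reading_cost_per_found(INSPECTED, SATD_RATE, RECALL):.1f}")
```

Under these assumptions, the reader goes through 2,400 comments to recover 450 of the 500 SATDs, which is a ratio of roughly 4.8 to the SATDs present and slightly more than 5 to the SATDs actually found.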
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software Reliability and Analysis Research
