Identifying Self-Admitted Technical Debt in Issue Tracking Systems using Machine Learning
Yikun Li, Mohamed Soliman, Paris Avgeriou

TL;DR
This paper presents a machine learning approach that automatically identifies self-admitted technical debt (SATD) in issue tracking systems, extending detection beyond source code comments.
Contribution
It introduces a manually analyzed dataset of 4,200 issues from seven open-source projects and develops an optimized machine learning method for SATD detection in issues.
Findings
The proposed approach significantly outperforms baseline methods in F1-score.
Knowledge transfer from related datasets enhances prediction performance.
The SATD keywords are intuitive and point to both the types and the indicators of technical debt.
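The first finding is stated in terms of F1-score, the harmonic mean of precision and recall. The following is a minimal sketch of how that metric is computed for a binary SATD classifier. The function name and the toy labels (1 = SATD section, 0 = non-SATD section) are hypothetical and chosen for illustration only; this is not the paper's evaluation code or data.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """Compute the F1-score for the `positive` class.

    Hypothetical helper for illustration; labels are assumed to be
    1 for an SATD section and 0 for a non-SATD section.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: SATD sections correctly flagged as SATD.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: non-SATD sections wrongly flagged as SATD.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: SATD sections the classifier missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    if tp == 0:
        # Precision and recall are zero or undefined; report 0.0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up labels for six issue sections.
true_labels = [1, 1, 1, 0, 0, 0]
predicted_labels = [1, 1, 0, 1, 0, 0]
score = f1_score(true_labels, predicted_labels)
print(round(score, 3))
```

Because F1 combines precision and recall, it penalizes a classifier that flags everything as SATD as well as one that flags almost nothing, which matters when SATD sections are a minority of all issue sections.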
Abstract
Technical debt is a metaphor indicating sub-optimal solutions implemented for short-term benefits by sacrificing the long-term maintainability and evolvability of software. A special type of technical debt is explicitly admitted by software engineers (e.g., using a TODO comment); this is called Self-Admitted Technical Debt or SATD. Most work on automatically identifying SATD focuses on source code comments. In addition to source code comments, issue tracking systems have been shown to be another rich source of SATD, but there are no approaches specifically for automatically identifying SATD in issues. In this paper, we first create a training dataset by collecting and manually analyzing 4,200 issues (that break down to 23,180 sections of issues) from seven open-source projects (i.e., Camel, Chromium, Gerrit, Hadoop, HBase, Impala, and Thrift) using two popular issue tracking systems (i.e.,…
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software System Performance and Reliability
