Detecting Discussions of Technical Debt
Ipek Ozkaya, Zachary Kurtz, Robert L. Nord, Raghvinder S. Sangwan, Satish M. Srinivasan

TL;DR
This paper uses machine learning to identify discussions of technical debt (TD) in software issue trackers, enabling better understanding and management of TD-related issues in large repositories.
Contribution
It introduces an automated classifier for detecting technical debt discussions in issue tracker tickets, based on expert-labeled data from Chromium.
Findings
Technical debt discussions occur in about 16% of Chromium issues.
The classifier can effectively identify TD-related tickets.
Automating TD detection could help focus attention on which resolution practices are most useful.
Abstract
Technical debt (TD) refers to suboptimal choices during software development that achieve short-term goals at the expense of long-term quality. Although developers often informally discuss TD, the concept has not yet crystalized into a consistently applied label when describing issues in most repositories. We apply machine learning to understand developer insights into TD when discussing tickets in an issue tracker. We generate expert labels that indicate whether discussion of TD occurs in the free text associated with each ticket in a sample of more than 1,900 tickets in the Chromium issue tracker. We then use these labels to train a classifier that estimates labels for the remaining 475,000 tickets. We conclude that discussion of TD appears in about 16% of the tracked Chromium issues. If we can effectively classify TD-related issues, we can focus on what practices could be most useful…
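The label-then-classify workflow described in the abstract can be illustrated with a minimal sketch. The abstract does not name the model or the features the authors used, so everything here is an assumption for illustration: a small multinomial naive Bayes classifier over word counts, written in plain Python, with hypothetical function names (`tokenize`, `train_naive_bayes`, `predict_td`) and four made-up tickets standing in for the expert-labeled sample. It is not the authors' classifier.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(labeled_tickets):
    """Count words per class from (text, is_td) pairs.

    Assumes both classes (True and False) appear at least once.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_td in labeled_tickets:
        doc_counts[is_td] += 1
        word_counts[is_td].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def predict_td(model, text):
    """Return True if the TD class has the higher log-score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    tokens = [tok for tok in tokenize(text) if tok in vocab]
    scores = {}
    for label in (True, False):
        # Class prior estimated from the labeled sample.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            # Add-one (Laplace) smoothing so unseen words do not zero out a class.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return scores[True] > scores[False]


# Made-up stand-ins for expert-labeled tickets (True = discusses TD).
labeled = [
    ("Temporary hack, clean up this workaround later", True),
    ("Refactor legacy code, remove duplicated logic", True),
    ("Crash when opening settings page", False),
    ("Button text is misaligned on the toolbar", False),
]
model = train_naive_bayes(labeled)

# Made-up stand-ins for the remaining, unlabeled tickets.
unlabeled = [
    "We should refactor this hack before it spreads",
    "Crash on the settings toolbar",
]
predictions = [predict_td(model, text) for text in unlabeled]

# Estimated share of unlabeled tickets that discuss TD.
td_share = sum(predictions) / len(predictions)
print(predictions, td_share)
```

In this sketch the estimated prevalence is simply the fraction of unlabeled tickets predicted as TD. An estimate produced this way inherits the classifier's errors, so a real study would also need to measure accuracy on held-out labeled tickets before reporting a figure.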
Taxonomy
Topics: Software Engineering Research · Open Source Software Innovations · Software Engineering Techniques and Practices
