Identifying Technical Debt and Its Types Across Diverse Software Projects Issues
Karthik Shivashankar, Mili Orucevic, Maren Maritsdatter Kruke, Antonio Martini

TL;DR
This paper presents a transformer-based approach for accurately identifying and classifying various types of technical debt in large-scale software projects, emphasizing the importance of project-specific models and efficient classifiers.
Contribution
It introduces an ensemble of binary classifiers with in-project fine-tuning and demonstrates the effectiveness of smaller models like DistilRoBERTa for technical debt detection across diverse datasets.
Findings
In-project fine-tuning improves TD classification accuracy.
Binary classifiers outperform multi-class models for TD types.
DistilRoBERTa is more effective than larger models like GPTs.
Abstract
Technical Debt (TD) identification in software project issues is crucial for maintaining code quality, reducing long-term maintenance costs, and improving overall project health. This study advances TD classification using transformer-based models, addressing the critical need for accurate and efficient TD identification in large-scale software development. Our methodology employs multiple binary classifiers for TD and its type, combined through ensemble learning, to enhance accuracy and robustness in detecting various forms of TD. We train and evaluate these models on a comprehensive dataset from GitHub Archive Issues (2015-2024), supplemented with industrial data validation. We demonstrate that in-project fine-tuned transformer models significantly outperform task-specific fine-tuned models in TD classification, highlighting the importance of project-specific context in accurate…
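The idea of combining several binary classifiers can be illustrated with a minimal sketch. The abstract does not specify the combination rule, so the gating scheme below (a TD-vs-not-TD classifier first, then one binary classifier per TD type) is an assumption, not the authors' method. The names `classify_issue` and `keyword_scorer`, the type labels "code" and "test", and the 0.5 threshold are all hypothetical; the keyword scorers stand in for fine-tuned transformer classifiers.

```python
from typing import Callable, Dict, List

# A scorer maps issue text to a probability in [0, 1].
# Hypothetical interface: in practice each scorer would wrap a fine-tuned model.
Scorer = Callable[[str], float]


def classify_issue(
    text: str,
    td_scorer: Scorer,
    type_scorers: Dict[str, Scorer],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[str]:
    """Return the TD types assigned to an issue, or [] if it is not TD.

    Assumed scheme: a binary TD gate, then independent per-type binary decisions.
    """
    if td_scorer(text) < threshold:
        return []
    return sorted(
        name for name, scorer in type_scorers.items() if scorer(text) >= threshold
    )


def keyword_scorer(*keywords: str) -> Scorer:
    """Toy stand-in for a binary classifier: 1.0 if any keyword occurs, else 0.0."""
    def score(text: str) -> float:
        lowered = text.lower()
        return 1.0 if any(k in lowered for k in keywords) else 0.0
    return score


# Hypothetical classifiers and labels, for illustration only.
td_gate = keyword_scorer("refactor", "workaround", "flaky")
per_type = {
    "code": keyword_scorer("refactor", "workaround"),
    "test": keyword_scorer("flaky"),
}

print(classify_issue("Refactor the flaky test harness", td_gate, per_type))
print(classify_issue("Add dark mode", td_gate, per_type))
```

Because each type has its own binary decision, an issue can carry more than one TD type, which a single multi-class head would not allow without modification.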
Taxonomy
Topics: Software Engineering Research · Open Source Software Innovations · Software Engineering Techniques and Practices
