Reducing Labeling Effort in Architecture Technical Debt Detection through Active Learning and Explainable AI
Edi Sutoyo, Paris Avgeriou, Andrea Capiluppi

TL;DR
This paper presents a method to reduce labeling effort in detecting Architecture Technical Debt by combining keyword filtering, active learning, and explainable AI, reaching an F1-score of 0.72 while cutting annotation effort by 49%.
Contribution
It introduces a novel approach integrating keyword filtering, active learning, and explainable AI to efficiently detect Architecture Technical Debt with reduced manual annotation.
Findings
Active learning with the Breaking Ties query strategy raises the F1-score to 0.72.
Annotation effort is reduced by 49%.
LIME is preferred for explanations due to clarity.
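
The Breaking Ties strategy named in the findings is a margin-based query strategy: it asks annotators to label the pool items for which the classifier's two most probable classes are closest together. The following is a minimal sketch of that selection rule, not the paper's implementation. The function names, the two-class probability rows, and the batch size are all assumptions made for illustration.

```python
from typing import List, Sequence


def breaking_ties_margins(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Return the gap between the two highest class probabilities per item."""
    margins = []
    for row in probabilities:
        if len(row) < 2:
            raise ValueError("each row needs at least two class probabilities")
        best, second = sorted(row, reverse=True)[:2]
        margins.append(best - second)
    return margins


def select_for_labeling(probabilities: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Pick the indices with the smallest margins (the most ambiguous items)."""
    margins = breaking_ties_margins(probabilities)
    ranked = sorted(range(len(margins)), key=lambda i: margins[i])
    return ranked[:batch_size]


# Hypothetical predicted probabilities (ATD vs. non-ATD) for four unlabeled issues.
pool_probs = [
    [0.95, 0.05],  # confident prediction, large margin
    [0.55, 0.45],  # ambiguous
    [0.70, 0.30],
    [0.51, 0.49],  # most ambiguous
]
selected = select_for_labeling(pool_probs, batch_size=2)
print(selected)
```

In an active learning loop, the selected items would be labeled, added to the training set, and the classifier retrained before the next query round.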
Abstract
Self-Admitted Technical Debt (SATD) refers to technical compromises explicitly admitted by developers in natural language artifacts such as code comments, commit messages, and issue trackers. Among its types, Architecture Technical Debt (ATD) is particularly difficult to detect due to its abstract and context-dependent nature. Manual annotation of ATD is costly, time-consuming, and challenging to scale. This study focuses on reducing labeling effort in ATD detection by combining keyword-based filtering with active learning and explainable AI. We refined an existing dataset of 116 ATD-related Jira issues from prior work, producing 57 expert-validated items used to extract representative keywords. These were applied to identify over 103,000 candidate issues across ten open-source projects. To assess the reliability of this keyword-based filtering, we conducted a qualitative evaluation of…
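
The keyword-based filtering step described in the abstract can be pictured as a simple text match over issue fields. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the keyword list, the `summary` and `description` field names, and the sample issues are hypothetical, and the paper's actual keywords are not given in this summary.

```python
import re
from typing import Dict, List, Sequence

# Hypothetical keywords; the paper derives its own from 57 expert-validated items.
HYPOTHETICAL_KEYWORDS = ["refactor", "decouple", "dependency", "architecture"]


def build_pattern(keywords: Sequence[str]) -> "re.Pattern[str]":
    """Compile a case-insensitive pattern matching words that start with a keyword."""
    escaped = [re.escape(keyword) for keyword in keywords]
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\w*", re.IGNORECASE)


def filter_candidates(issues: Sequence[Dict[str, str]], keywords: Sequence[str]) -> List[Dict[str, str]]:
    """Keep the issues whose summary or description mentions any keyword."""
    pattern = build_pattern(keywords)
    return [
        issue
        for issue in issues
        if pattern.search(issue["summary"] + " " + issue["description"])
    ]


# Hypothetical issues standing in for Jira records.
issues = [
    {"id": "DEMO-1", "summary": "Refactoring the scheduler module", "description": "Split it up."},
    {"id": "DEMO-2", "summary": "Fix typo in README", "description": "Spelling only."},
    {"id": "DEMO-3", "summary": "Core and api are tangled", "description": "Remove the cyclic dependency."},
]
candidate_ids = [issue["id"] for issue in filter_candidates(issues, HYPOTHETICAL_KEYWORDS)]
print(candidate_ids)
```

A filter like this only narrows the pool; the candidates it returns still need manual or model-based validation.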
Taxonomy
Topics: Software Engineering Research · Advanced Text Analysis Techniques · Topic Modeling
