Prevalence, Contents and Automatic Detection of KL-SATD

Leevi Rantala; Mika Mäntylä; David Lo

arXiv:2008.05159·cs.SE·August 13, 2020

TL;DR

This paper investigates Keyword-Labeled SATD (KL-SATD) in source code comments. It analyzes the prevalence and content of KL-SATD comments and builds a machine learning classifier that detects them automatically and flags comments that lack a SATD keyword but should have one, which aids technical debt management.

Contribution

It provides the first large-scale analysis of KL-SATD, compares its content to manually labeled SATD, and introduces a logistic Lasso regression model for effective automatic detection.

Findings

01

KL-SATD comments make up a median of 1.52% of all comments across the 33 studied repositories.

02

KL-SATD comments contain words expressing code changes and uncertainty, such as remove, fix, maybe and probably.

03

The classifier achieves an AUC-ROC of 0.88 in detecting KL-SATD.
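
The prevalence figure in the first finding can be illustrated with a minimal sketch. It assumes that a comment counts as KL-SATD when it contains one of the listed keywords, matched case-insensitively, and that prevalence is computed per repository before taking the median. The keyword list (only TODO and FIXME are named in the abstract), the function names, and the toy data are hypothetical and are not taken from the paper's repository.

```python
import re
from statistics import median

# Assumption: the abstract names only TODO and FIXME as example keywords.
KEYWORDS = ("TODO", "FIXME")
KEYWORD_RE = re.compile(r"\b(?:" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def is_kl_satd(comment: str) -> bool:
    """Return True if the comment carries one of the assumed SATD keywords."""
    return KEYWORD_RE.search(comment) is not None


def kl_satd_percentage(comments) -> float:
    """Percentage of comments in one repository that are keyword-labeled."""
    if not comments:
        return 0.0
    hits = sum(is_kl_satd(c) for c in comments)
    return 100.0 * hits / len(comments)


def median_prevalence(repos) -> float:
    """Median of the per-repository KL-SATD percentages."""
    return median(kl_satd_percentage(c) for c in repos.values())


# Hypothetical toy data: repository name -> list of comment strings.
toy_repos = {
    "repo_a": [
        "TODO: remove this hack",
        "Returns the user id",
        "Parse the header",
        "Close the stream",
    ],
    "repo_b": ["FIXME maybe wrong for empty input", "Initialise cache"],
    "repo_c": ["Compute checksum", "Open the file"],
}

toy_median = median_prevalence(toy_repos)
```

Taking the median over repositories rather than pooling all comments keeps one very large repository from dominating the figure.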

Abstract

When developers use different keywords such as TODO and FIXME in source code comments to describe self-admitted technical debt (SATD), we refer to it as Keyword-Labeled SATD (KL-SATD). We study KL-SATD from 33 software repositories with 13,588 KL-SATD comments. We find that the median percentage of KL-SATD comments among all comments is only 1.52%. We find that KL-SATD comment contents include words expressing code changes and uncertainty, such as remove, fix, maybe and probably. This makes them different compared to other comments. KL-SATD comment contents are similar to manually labeled SATD comments of prior work. Our machine learning classifier using logistic Lasso regression has good performance in detecting KL-SATD comments (AUC-ROC 0.88). Finally, we demonstrate that using machine learning we can identify comments that are currently missing but which should have a SATD keyword in…
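
To make the classifier idea concrete, the following is a minimal sketch of L1-regularized (Lasso) logistic regression over binary bag-of-words features, trained with proximal gradient descent in plain Python. It is an illustration of the general technique, not the authors' implementation. The tokenizer, the hyperparameters (`lam`, `lr`, `epochs`), the function names, and the toy comments are all assumptions, and the toy comments are written without their keywords on the assumption that the model should learn from the remaining content.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_logistic_lasso(docs, labels, lam=0.05, lr=0.5, epochs=300):
    """Fit word weights and an unpenalized intercept with proximal gradient descent."""
    vocab = sorted({tok for d in docs for tok in tokenize(d)})
    index = {tok: i for i, tok in enumerate(vocab)}
    # Binary presence features: each document is the set of word indices it contains.
    rows = [[index[t] for t in set(tokenize(d))] for d in docs]
    w = [0.0] * len(vocab)
    b = 0.0
    n = len(docs)
    for _ in range(epochs):
        grad_w = [0.0] * len(vocab)
        grad_b = 0.0
        for row, y in zip(rows, labels):
            err = sigmoid(b + sum(w[j] for j in row)) - y
            grad_b += err / n
            for j in row:
                grad_w[j] += err / n
        b -= lr * grad_b
        # Gradient step on the log-loss, then L1 shrinkage on the weights only.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return dict(zip(vocab, w)), b


def predict_proba(text, weights, b):
    """Estimated probability that a comment belongs to the positive class."""
    return sigmoid(b + sum(weights.get(t, 0.0) for t in set(tokenize(text))))


# Hypothetical toy data: 1 = KL-SATD comment (keyword removed), 0 = other comment.
docs = [
    "remove this hack later", "maybe fix the parser",
    "probably remove the cache", "fix this workaround",
    "returns the user id", "parse the header",
    "open the file", "close the stream",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic_lasso(docs, labels)
p_debt = predict_proba("maybe remove this", weights, bias)
p_plain = predict_proba("returns the file", weights, bias)
```

The L1 penalty drives the weights of uninformative words toward zero, so the surviving nonzero weights can be read as a list of words associated with each class. Applying `predict_proba` to comments that carry no keyword is one way to surface candidates that may deserve a SATD keyword.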

Code & Models

Repositories

M3SOulu/KLSATD_SEAA_2020
Official
