An Empirical Study on the Effectiveness of Large Language Models for SATD Identification and Classification

Mohammad Sadegh Sheikhaei; Yuan Tian; Shaowei Wang; Bowen Xu

arXiv:2405.06806 · cs.SE · May 14, 2024

TL;DR

This study evaluates how well large language models, especially the Flan-T5 family, identify and classify Self-Admitted Technical Debt (SATD). Fine-tuned LLMs beat the best non-LLM baseline on identification, while classification results depend on model size, added context, and usage setting.

Contribution

The paper provides a comprehensive empirical analysis of LLMs on SATD identification and classification, comparing fine-tuning and in-context learning across model sizes and settings.

Findings

1. Fine-tuned LLMs outperform non-LLM baselines in SATD identification.

2. For SATD classification, larger models given contextual information perform better.

3. Few-shot in-context learning can surpass fine-tuned smaller models in classification.
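
To make the few-shot in-context learning setting concrete, the sketch below shows one plausible way to assemble a classification prompt from a handful of labeled comments. It is an illustration only, not the authors' prompt: the function name `build_few_shot_prompt`, the label names, the example comments, and the prompt wording are all assumptions introduced here.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str]], query: str, labels: List[str]
) -> str:
    """Assemble a few-shot classification prompt (hypothetical format)."""
    # Instruction line listing the candidate labels.
    header = "Classify the technical debt comment into one of: " + ", ".join(labels) + "."
    blocks = [header]
    # One demonstration block per labeled example.
    for comment, label in examples:
        blocks.append(f"Comment: {comment}\nLabel: {label}")
    # The query comes last, with the label left open for the model to complete.
    blocks.append(f"Comment: {query}\nLabel:")
    return "\n\n".join(blocks)


# Hypothetical labels and demonstrations, not taken from the paper's dataset.
LABELS = ["design", "defect"]
EXAMPLES = [
    ("TODO: refactor this hack", "design"),
    ("FIXME: crashes on empty input", "defect"),
]

prompt = build_few_shot_prompt(
    EXAMPLES, "XXX: temporary workaround, clean up later", LABELS
)
print(prompt)
```

The resulting string would be passed to a model as-is; in the fine-tuning setting, by contrast, the labeled examples would update the model's weights rather than appear in the prompt.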

Abstract

Self-Admitted Technical Debt (SATD), a concept highlighting sub-optimal choices in software development documented in code comments or other project resources, poses challenges in the maintainability and evolution of software systems. Large language models (LLMs) have demonstrated significant effectiveness across a broad range of software tasks, especially in software text generation tasks. Nonetheless, their effectiveness in tasks related to SATD is still under-researched. In this paper, we investigate the efficacy of LLMs in both identification and classification of SATD. For both tasks, we investigate the performance gain from using more recent LLMs, specifically the Flan-T5 family, across different common usage settings. Our results demonstrate that for SATD identification, all fine-tuned LLMs outperform the best existing non-LLM baseline, i.e., the CNN model, with a 4.4% to 7.2%…

Code & Models

Repositories

RISElabQueens/SATD_LLM (official)

Taxonomy

Topics: Natural Language Processing Techniques