Evaluating Large Language Models in Detecting Test Smells

Keila Lucas; Rohit Gheyi; Elvys Soares; Márcio Ribeiro; Ivan Machado

arXiv:2407.19261·cs.SE·July 31, 2024

Open Access

TL;DR

This paper evaluates how well three large language models (ChatGPT-4, Mistral Large, and Gemini Advanced) automatically detect 30 types of test smells in code written in seven programming languages.

Contribution

It provides an empirical assessment of LLMs' capability to identify test smells, highlighting their potential as tools for improving software quality.

Findings

01. ChatGPT-4 identified 21 test smell types.

02. Gemini Advanced identified 17 test smell types.

03. Mistral Large detected 15 test smell types.

Abstract

Test smells are coding issues that typically arise from inadequate practices, a lack of knowledge about effective testing, or deadline pressures to complete projects. The presence of test smells can negatively impact the maintainability and reliability of software. While there are tools that use advanced static analysis or machine learning techniques to detect test smells, these tools often require effort to be used. This study aims to evaluate the capability of Large Language Models (LLMs) in automatically detecting test smells. We evaluated ChatGPT-4, Mistral Large, and Gemini Advanced using 30 types of test smells across codebases in seven different programming languages collected from the literature. ChatGPT-4 identified 21 types of test smells. Gemini Advanced identified 17 types, while Mistral Large detected 15 types of test smells. Conclusion: The LLMs demonstrated potential as a…
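
To make the term concrete, here is a minimal Python sketch of one smell that is commonly cataloged in the test smell literature, Assertion Roulette (several unexplained assertions in a single test). The page above does not list the 30 smell types studied, so the choice of smell is an assumption, and `parse_version` and both test classes are hypothetical names invented for illustration; this is not code from the paper.

```python
import unittest


def parse_version(text):
    """Hypothetical function under test: split 'major.minor.patch' into ints."""
    major, minor, patch = text.split(".")
    return int(major), int(minor), int(patch)


class SmellyVersionTest(unittest.TestCase):
    # Assertion Roulette: several assertions without messages in one test,
    # so a failure report does not explain which expectation was violated.
    def test_parse(self):
        result = parse_version("1.2.3")
        self.assertEqual(result[0], 1)
        self.assertEqual(result[1], 2)
        self.assertEqual(result[2], 3)


class CleanerVersionTest(unittest.TestCase):
    # One possible refactoring: a descriptive test name and a single
    # assertion that carries an explanatory message.
    def test_parse_returns_all_three_components(self):
        self.assertEqual(
            parse_version("1.2.3"),
            (1, 2, 3),
            "expected (major, minor, patch) as integers",
        )
```

Both tests pass; the smell concerns how readable and maintainable the test is, not whether it is correct. A detector, whether a static analysis tool or an LLM, would be expected to flag the first class and not the second.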

Taxonomy

Topics: Advanced Chemical Sensor Technologies · Advanced Text Analysis Techniques · Sentiment Analysis and Opinion Mining