Unit Testing Past vs. Present: Examining LLMs' Impact on Defect Detection and Efficiency

Rudolf Ramler; Philipp Straubinger; Reinhold Plösch; Dietmar Winkler

arXiv:2502.09801·cs.SE·February 17, 2025

TL;DR

This study empirically evaluates how large language models (LLMs) such as ChatGPT and GitHub Copilot affect defect detection and efficiency in unit testing, and reports significant improvements over manual testing.

Contribution

It replicates and extends an earlier experiment to compare LLM-supported and manual unit testing empirically, and finds that LLM support increases both defect detection and testing efficiency.

Findings

01 LLM support increases the number of unit tests generated

02 LLM support improves defect detection rates

03 LLM support enhances overall testing efficiency

Abstract

The integration of Large Language Models (LLMs), such as ChatGPT and GitHub Copilot, into software engineering workflows has shown potential to enhance productivity, particularly in software testing. This paper investigates whether LLM support improves defect detection effectiveness during unit testing. Building on prior studies comparing manual and tool-supported testing, we replicated and extended an experiment in which participants, supported by LLMs, wrote unit tests for a Java-based system with seeded defects within a time-boxed session. Comparing LLM-supported and manual testing, results show that LLM support significantly increases the number of unit tests generated, defect detection rates, and overall testing efficiency. These findings highlight the potential of LLMs to improve testing and defect detection outcomes, providing empirical insights into their practical application in…
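To make the experimental setup more concrete, the following is a minimal sketch of what "a unit test detecting a seeded defect" can look like in Java. Everything in it is an assumption made for illustration: the `BoundedStack` class, the seeded defect in `isEmpty`, and the check names are hypothetical and are not taken from the system used in the study. Plain Java checks stand in for a test framework so that the example runs without extra dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: all names are hypothetical, not from the study's system.
class SeededDefectDemo {

    /** Hypothetical class under test with one deliberately seeded defect. */
    static class BoundedStack {
        private final int[] items;
        private int size = 0;

        BoundedStack(int capacity) {
            items = new int[capacity];
        }

        void push(int value) {
            if (size == items.length) {
                throw new IllegalStateException("stack is full");
            }
            items[size++] = value;
        }

        int pop() {
            if (size == 0) {
                throw new IllegalStateException("stack is empty");
            }
            return items[--size];
        }

        boolean isEmpty() {
            // Seeded defect: the correct condition would be size == 0.
            return size <= 1;
        }
    }

    // Each method below plays the role of one unit test and
    // returns true when the observed behavior matches the expectation.

    static boolean newStackIsEmpty() {
        return new BoundedStack(2).isEmpty();
    }

    static boolean stackWithOneItemIsNotEmpty() {
        BoundedStack stack = new BoundedStack(2);
        stack.push(7);
        return !stack.isEmpty();
    }

    static boolean popReturnsLastPushed() {
        BoundedStack stack = new BoundedStack(2);
        stack.push(1);
        stack.push(2);
        return stack.pop() == 2;
    }

    /** Runs all checks and returns the names of those that fail. */
    static List<String> failingChecks() {
        List<String> failed = new ArrayList<>();
        if (!newStackIsEmpty()) failed.add("newStackIsEmpty");
        if (!stackWithOneItemIsNotEmpty()) failed.add("stackWithOneItemIsNotEmpty");
        if (!popReturnsLastPushed()) failed.add("popReturnsLastPushed");
        return failed;
    }

    public static void main(String[] args) {
        System.out.println("Failing checks: " + failingChecks());
    }
}
```

In this sketch only the check that exercises the boundary case (a stack holding exactly one item) fails, which is how a test reveals the seeded defect; the other two checks pass despite the defect being present.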

Taxonomy

Topics: VLSI and Analog Circuit Testing