Evaluating Large Language Models for the Generation of Unit Tests with Equivalence Partitions and Boundary Values

Martín Rodríguez, Gustavo Rossi, and Alejandro Fernandez

arXiv:2505.09830 · cs.SE · May 16, 2025

Open Access

TL;DR

This paper assesses the capability of Large Language Models to automatically generate unit tests focusing on equivalence partitions and boundary values, highlighting their potential and current limitations compared to manual testing.

Contribution

It introduces an optimized prompt design for LLMs to generate critical test cases and compares their performance with human programmers using both quantitative and qualitative analyses.

Findings

01 LLMs' effectiveness depends on well-designed prompts, robust implementation, and precise requirements

02 Human supervision remains essential for reliable test generation

03 LLMs are flexible and promising but are not yet ready for fully autonomous testing

Abstract

The design and implementation of unit tests is a complex task that many programmers neglect. This research evaluates the potential of Large Language Models (LLMs) to generate test cases automatically, comparing the results with manually written tests. An optimized prompt was developed that integrates code and requirements and covers critical cases such as equivalence partitions and boundary values. The strengths and weaknesses of LLMs versus trained programmers were compared through quantitative metrics and manual qualitative analysis. The results show that the effectiveness of LLMs depends on well-designed prompts, robust implementation, and precise requirements. Although flexible and promising, LLMs still require human supervision. This work highlights the importance of manual qualitative analysis as an essential complement to automation in unit test evaluation.
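
To make the two test-design techniques named in the abstract concrete, the following is a minimal sketch and is not taken from the paper. The function `classify_score` and its requirement (scores from 0 to 100 are valid, and 60 or above passes) are hypothetical, chosen only to illustrate how equivalence partitions and boundary values translate into unit tests.

```python
import unittest


def classify_score(score: int) -> str:
    """Hypothetical function under test: maps an exam score to a label."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    return "pass" if score >= 60 else "fail"


class ClassifyScoreTests(unittest.TestCase):
    def test_valid_equivalence_partitions(self):
        # One representative value from each valid partition.
        self.assertEqual(classify_score(30), "fail")  # partition 0..59
        self.assertEqual(classify_score(80), "pass")  # partition 60..100

    def test_invalid_equivalence_partitions(self):
        # One representative value from each invalid partition.
        for score in (-20, 150):
            with self.assertRaises(ValueError):
                classify_score(score)

    def test_boundary_values(self):
        # Values on both sides of every partition edge.
        self.assertEqual(classify_score(0), "fail")
        self.assertEqual(classify_score(59), "fail")
        self.assertEqual(classify_score(60), "pass")
        self.assertEqual(classify_score(100), "pass")
        for score in (-1, 101):
            with self.assertRaises(ValueError):
                classify_score(score)
```

Under these assumed requirements, the partition tests check one typical value per input class, while the boundary tests target the edges (-1/0, 59/60, 100/101) where off-by-one defects tend to appear.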

Taxonomy

Topics: Natural Language Processing Techniques