A System for Automated Unit Test Generation Using Large Language Models and Assessment of Generated Test Suites
Andrea Lops, Fedelucio Narducci, Azzurra Ragone, Michelantonio Trizio, Claudio Bartolini

TL;DR
This paper introduces AgoneTest, an automated system leveraging large language models to generate and evaluate complex, class-level Java test suites, addressing previous limitations of small-scale, manual, and less realistic test evaluations.
Contribution
The paper presents a scalable automated system for generating and assessing class-level Java test suites using LLMs, along with a new dataset and evaluation methodology for real-world applicability.
Findings
The automated system successfully generates complex, class-level Java test suites.
The evaluation methodology provides a detailed assessment of test quality.
The new dataset enables comparison between human-written and generated tests.
Abstract
Unit tests represent the most basic level of testing within the software testing lifecycle and are crucial to ensuring software correctness. Designing and creating unit tests is a costly and labor-intensive process that is ripe for automation. Recently, Large Language Models (LLMs) have been applied to various aspects of software development, including unit test generation. Although several empirical studies evaluating LLMs' capabilities in test code generation exist, they primarily focus on simple scenarios, such as the straightforward generation of unit tests for individual methods. These evaluations often involve independent and small-scale test units, providing a limited view of LLMs' performance in real-world software development scenarios. Moreover, previous studies do not approach the problem at a suitable scale for real-life applications. Generated unit tests are often evaluated…
Taxonomy
Topics: Software System Performance and Reliability · Software Testing and Debugging Techniques
