Ever-Improving Test Suite by Leveraging Large Language Models

Ketai Qiu

arXiv:2506.11000·cs.SE·October 23, 2025

TL;DR

This paper introduces E-Test, a method that uses Large Language Models to incrementally enhance software test suites by identifying untested and error-prone behaviors, improving testing effectiveness.

Contribution

E-Test is the first approach to leverage Large Language Models for incremental test suite augmentation based on production behavior analysis.

Findings

01  E-Test outperforms existing methods in identifying inadequately tested behaviors.

02  E-Test effectively augments test suites to cover emergent production behaviors.

03  Experimental results demonstrate improved test suite quality and coverage.

Abstract

Augmenting test suites with test cases that reflect the actual usage of the software system is extremely important to sustain the quality of long-lasting software systems. In this paper, we propose E-Test, an approach that incrementally augments a test suite with test cases that exercise behaviors that emerge in production and have not been tested yet. E-Test leverages Large Language Models to identify already-tested, not-yet-tested, and error-prone unit execution scenarios, and augments the test suite accordingly. Our experimental evaluation shows that E-Test outperforms the main state-of-the-art approaches for identifying inadequately tested behaviors and optimizing test suites.
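
The classify-then-augment loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names Scenario, TestSuite, Verdict, augment, and toy_classifier are hypothetical, the Large Language Model is replaced by a plain callable so the example runs offline, and the choice to add both not-yet-tested and error-prone scenarios to the suite is an assumption about what "accordingly" means.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Verdict(Enum):
    # The three categories named in the abstract.
    ALREADY_TESTED = "already-tested"
    NOT_YET_TESTED = "not-yet-tested"
    ERROR_PRONE = "error-prone"


@dataclass
class Scenario:
    """A unit execution scenario observed in production (hypothetical shape)."""
    unit: str
    inputs: Dict[str, object]


@dataclass
class TestSuite:
    """The existing tests, modeled here as the scenarios they already exercise."""
    cases: List[Scenario] = field(default_factory=list)


# Stand-in for the LLM call: any callable mapping (scenario, suite) to a Verdict.
Classifier = Callable[[Scenario, TestSuite], Verdict]


def augment(suite: TestSuite, observed: List[Scenario],
            classify: Classifier) -> List[Scenario]:
    """Add each scenario not classified as already tested; return the error-prone ones."""
    flagged = []
    for scenario in observed:
        verdict = classify(scenario, suite)
        if verdict is Verdict.ALREADY_TESTED:
            continue  # nothing new to learn from this scenario
        suite.cases.append(scenario)  # assumption: both remaining categories are added
        if verdict is Verdict.ERROR_PRONE:
            flagged.append(scenario)
    return flagged


def toy_classifier(scenario: Scenario, suite: TestSuite) -> Verdict:
    """Toy rule in place of a model: exact match is tested, zero divisor is error-prone."""
    if scenario in suite.cases:
        return Verdict.ALREADY_TESTED
    if scenario.inputs.get("divisor") == 0:
        return Verdict.ERROR_PRONE
    return Verdict.NOT_YET_TESTED
```

Because the classifier is passed in as a parameter, a real model-backed function could replace toy_classifier without changing the loop; how E-Test actually prompts the model and generates test code is not specified in the abstract and is left out here.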
