Adaptive Testing for LLM-Based Applications: A Diversity-based Approach
Juyeon Yoon, Robert Feldt, Shin Yoo

TL;DR
This paper introduces a diversity-based adaptive testing method for LLM applications that improves failure detection efficiency and output variety by selecting test prompts based on string distance metrics.
Contribution
It adapts Adaptive Random Testing (ART) to the testing of prompt templates in LLM-based applications, using string-distance-based diversity measures to curate test suites.
Findings
Reduces testing cost by detecting failures more efficiently.
Enhances output diversity in LLM testing.
Effective with various string distance metrics.
Abstract
The recent surge of building software systems powered by Large Language Models (LLMs) has led to the development of various testing frameworks, primarily focused on treating prompt templates as the unit of testing. Despite the significant costs associated with test input execution and output assessment, the curation of optimized test suites is still overlooked in these tools, which calls for tailored test selection or prioritization strategies. In this paper, we show that diversity-based testing techniques, such as Adaptive Random Testing (ART) with appropriate string distance metrics, can be effectively applied to the testing of prompt templates. Our proposed adaptive testing approach adjusts the conventional ART process to this context by selecting new test inputs based on scores derived from the existing test suite and its labelling results. Our results, obtained using various…
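To make the idea of ART with a string distance metric concrete, here is a minimal sketch of the conventional fixed-size-candidate-set form of ART applied to a pool of test input strings. It is not the authors' implementation: the function names (`levenshtein`, `normalized_distance`, `art_select`), the choice of normalized Levenshtein distance, and the parameters `budget` and `candidates_per_step` are all assumptions made for illustration. The paper's adaptive variant, which also uses labelling results to score candidates, is not reproduced here.

```python
import random


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def art_select(pool, budget, candidates_per_step=10, seed=0):
    """Pick up to `budget` inputs from `pool`, favouring diverse ones.

    Hypothetical helper: at each step, sample a few candidates and keep
    the one whose nearest already-selected input is farthest away.
    """
    rng = random.Random(seed)
    remaining = list(pool)
    if not remaining or budget <= 0:
        return []
    # The first test input is chosen uniformly at random.
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    while remaining and len(selected) < budget:
        k = min(candidates_per_step, len(remaining))
        candidates = rng.sample(remaining, k)
        best = max(
            candidates,
            key=lambda c: min(normalized_distance(c, s) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    prompts = [
        "Summarize this article.",
        "Summarize this article!",
        "Translate the text to French.",
        "List three prime numbers.",
    ]
    print(art_select(prompts, budget=2))
```

Any other string distance could be swapped in for `normalized_distance` without changing the selection loop, which is one reason a max-min formulation is a convenient starting point for comparing distance metrics.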
Taxonomy
Topics: Software Testing and Debugging Techniques · Digital Rights Management and Security
