Adaptive Testing for LLM-Based Applications: A Diversity-based Approach

Juyeon Yoon; Robert Feldt; Shin Yoo

arXiv:2501.13480·cs.SE·January 24, 2025

TL;DR

This paper introduces a diversity-based adaptive testing method for LLM applications that improves failure detection efficiency and output variety by selecting test prompts based on string distance metrics.

Contribution

It adapts Adaptive Random Testing (ART) to the testing of prompt templates in LLM-based applications, using string-distance diversity measures to guide test suite curation.

Findings

01 Reduces testing costs while discovering failures.

02 Enhances output diversity in LLM testing.

03 Effective with various string distance metrics.

Abstract

The recent surge in building software systems powered by Large Language Models (LLMs) has led to the development of various testing frameworks, which primarily treat prompt templates as the unit of testing. Despite the significant costs of test input execution and output assessment, these tools still overlook the curation of optimized test suites, which calls for tailored test selection or prioritization strategies. In this paper, we show that diversity-based testing techniques, such as Adaptive Random Testing (ART) with appropriate string distance metrics, can be effectively applied to the testing of prompt templates. Our proposed adaptive testing approach adjusts the conventional ART process to this context by selecting new test inputs based on scores derived from the existing test suite and its labelling results. Our results, obtained using various…
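To make the idea of diversity-based selection concrete, here is a minimal sketch of conventional, candidate-based ART over strings, using Levenshtein distance as the string distance metric. It is an illustration only, not the paper's method: the adaptive variant described above also uses labelling results when scoring candidates, which this sketch leaves out. The names `levenshtein`, `art_select`, `candidates_per_step`, and the example prompts are all hypothetical.

```python
import random


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def art_select(pool, budget, candidates_per_step=5, seed=0):
    """Pick up to `budget` inputs from `pool`, favouring diverse ones.

    Assumption: this is plain candidate-based ART. Each step samples a few
    random candidates and keeps the one whose nearest already-selected
    input is farthest away.
    """
    if budget <= 0 or not pool:
        return []
    rng = random.Random(seed)
    remaining = list(pool)
    # The first test input is chosen uniformly at random.
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    while remaining and len(selected) < budget:
        k = min(candidates_per_step, len(remaining))
        candidates = rng.sample(remaining, k)
        best = max(
            candidates,
            key=lambda c: min(levenshtein(c, s) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical test inputs for a prompt template.
prompts = [
    "Summarize this article in one sentence.",
    "Summarize this article in two sentences.",
    "Translate the following text into French.",
    "List three risks of the proposed plan.",
    "Summarize this report in one sentence.",
    "Write a haiku about software testing.",
]
for prompt in art_select(prompts, budget=3):
    print(prompt)
```

Because each selected input is the candidate farthest from the inputs already chosen, near-duplicate prompts tend to be deferred, so a limited execution and labelling budget is spent on more varied inputs. Other string distance metrics could be swapped in for `levenshtein` without changing `art_select`.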

Taxonomy

Topics: Software Testing and Debugging Techniques · Digital Rights Management and Security