Automatic Generation of Behavioral Test Cases For Natural Language Processing Using Clustering and Prompting
Ying Li, Rahul Singh, Tarun Joshi, Agus Sudjianto

TL;DR
This paper presents an automated method for generating behavioral test cases for NLP models by clustering text representations and using prompting techniques, reducing manual effort and enabling comprehensive model evaluation.
Contribution
It introduces a novel automated approach combining clustering and prompting to generate behavioral test cases for NLP models, improving efficiency over manual methods.
Findings
Effective clustering of text representations for test case generation
Successful application to Amazon Reviews dataset
Analysis of model strengths and weaknesses
Abstract
Recent work in behavioral testing for natural language processing (NLP) models, such as Checklist, is inspired by related paradigms in software engineering testing. These tests allow evaluation of general linguistic capabilities and domain understanding, and hence can help evaluate conceptual soundness and identify model weaknesses. However, a major challenge is the creation of test cases. Current packages rely on a semi-automated approach involving manual development, which requires domain expertise and can be time-consuming. This paper introduces an automated approach to developing test cases by exploiting the power of large language models and statistical techniques. It clusters the text representations to carefully construct meaningful groups and then applies prompting techniques to automatically generate Minimal Functionality Tests (MFTs). The well-known Amazon Reviews corpus is used to demonstrate…
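The pipeline the abstract describes (cluster text representations, prompt a language model per cluster, run the generated cases as a Minimal Functionality Test) can be illustrated with a small standard-library sketch. This is not the authors' implementation: the abstract names neither the representation nor the clustering algorithm, so the bag-of-words `embed`, the cosine-based `kmeans`, the wording in `build_prompt`, and the toy model are all assumptions made for illustration. The language-model call itself is left out; the generated cases are written by hand.

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy bag-of-words vector (assumption: stands in for a learned text representation)."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(vectors, k, iters=10):
    """K-means-style clustering using cosine similarity.
    Deterministic init: the first k vectors serve as the starting centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def build_prompt(examples, n_cases=3):
    """Hypothetical prompt template asking a language model for new test cases."""
    shown = "\n".join(f"- {e}" for e in examples)
    return (
        f"Here are product reviews that share a theme:\n{shown}\n"
        f"Write {n_cases} short, unambiguous reviews on the same theme, "
        "each with its expected sentiment label."
    )

def run_mft(model, cases):
    """Run (text, expected_label) cases; return the failure rate and the failing cases."""
    failures = [(text, expected) for text, expected in cases if model(text) != expected]
    return len(failures) / len(cases), failures

# Made-up reviews, not taken from the Amazon Reviews corpus.
reviews = [
    "battery lasts long",
    "shipping was slow",
    "battery died fast",
    "shipping arrived late",
]
vocab = sorted({w for r in reviews for w in r.lower().split()})
labels = kmeans([embed(r, vocab) for r in reviews], k=2)

# One prompt per cluster; here only the first cluster is shown.
cluster0 = [r for r, l in zip(reviews, labels) if l == 0]
prompt = build_prompt(cluster0)

# Hand-written stand-ins for the cases a language model might return.
cases = [
    ("I love it", "positive"),
    ("I do not love it", "negative"),
    ("It is bad", "negative"),
]
toy_model = lambda text: "negative" if "not" in text else "positive"
rate, failures = run_mft(toy_model, cases)
print(labels, rate, failures)
```

In this sketch the failing cases returned by `run_mft` are what would expose a model weakness: the toy model only keys on the word "not", so it mislabels a plainly negative review that lacks it.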
Taxonomy
Topics: Software Testing and Debugging Techniques · Natural Language Processing Techniques · Software Engineering Research
