RETAIN: Interactive Tool for Regression Testing Guided LLM Migration
Tanay Dixit, Daniel Lee, Sally Fang, Sai Sree Harsha, Anirudh Sureshan, Akash Maharaj, Yunyao Li

TL;DR
RETAIN is an interactive tool designed to assist developers in regression testing during LLM migrations, helping identify errors and differences in model outputs more efficiently than manual methods.
Contribution
The paper introduces RETAIN, a novel regression testing tool with an interactive interface and error discovery module tailored for LLM migrations, improving error detection and prompt experimentation.
Findings
RETAIN enabled participants to identify twice as many errors as manual evaluation.
Participants could experiment with 75% more prompts using RETAIN.
Participants using RETAIN achieved 12% higher metric scores in a given time frame than with manual evaluation.
Abstract
Large Language Models (LLMs) are increasingly integrated into diverse applications. The rapid evolution of LLMs presents opportunities for developers to enhance applications continuously. However, this constant adaptation can also lead to performance regressions during model migrations. While several interactive tools have been proposed to streamline the complexity of prompt engineering, few address the specific requirements of regression testing for LLM migrations. To bridge this gap, we introduce RETAIN (REgression Testing guided LLM migrAtIoN), a tool designed explicitly for regression testing in LLM migrations. RETAIN comprises two key components: an interactive interface tailored to regression testing needs during LLM migrations, and an error discovery module that facilitates understanding of differences in model behaviors. The error discovery module generates textual descriptions…
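As a rough illustration of what regression testing during a model migration involves, the sketch below runs the same prompts through an old and a new model and flags prompts whose metric score dropped. It is not RETAIN's implementation: the names `find_regressions` and `Regression`, the stand-in models, and the toy metric are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Regression:
    """One prompt on which the new model scored lower than the old one."""
    prompt: str
    old_output: str
    new_output: str
    old_score: float
    new_score: float


def find_regressions(
    prompts: List[str],
    old_model: Callable[[str], str],
    new_model: Callable[[str], str],
    metric: Callable[[str, str], float],
    tolerance: float = 0.0,
) -> List[Regression]:
    """Return the prompts whose score dropped by more than `tolerance`."""
    regressions = []
    for prompt in prompts:
        old_out = old_model(prompt)
        new_out = new_model(prompt)
        old_score = metric(prompt, old_out)
        new_score = metric(prompt, new_out)
        if new_score < old_score - tolerance:
            regressions.append(
                Regression(prompt, old_out, new_out, old_score, new_score)
            )
    return regressions


# Hypothetical stand-ins for real LLM calls: the task is "answer in upper case".
def old_model(prompt: str) -> str:
    return prompt.upper()


def new_model(prompt: str) -> str:
    # Assumed behavior change: the new model ignores the instruction on long prompts.
    return prompt.upper() if len(prompt) < 10 else prompt


def uppercase_metric(prompt: str, output: str) -> float:
    # Toy metric: 1.0 if the output follows the upper-case instruction, else 0.0.
    return 1.0 if output.isupper() else 0.0


if __name__ == "__main__":
    prompts = ["short", "a much longer prompt"]
    for r in find_regressions(prompts, old_model, new_model, uppercase_metric):
        print(f"Regression on {r.prompt!r}: {r.old_score} -> {r.new_score}")
```

In practice the two callables would wrap real model endpoints and the metric would be task-specific; the per-prompt comparison is the part this sketch is meant to show.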
Taxonomy
Topics: Natural Language Processing Techniques
