Enhancing Large Language Models with Retrieval Augmented Generation for Software Testing and Inspection Automation
Zoe Fingleton, Nazanin Siavash, Armin Moin

TL;DR
This paper introduces a Retrieval Augmented Generation approach to enhance large language models for automated software testing and inspection, reducing costs and improving effectiveness.
Contribution
It presents a novel RAG pipeline that integrates external knowledge sources to improve LLM performance in software V&V activities.
Findings
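Adding external context through the RAG pipeline has a generally positive impact on LLM-based test case generation.
The same external context has a generally positive impact on LLM-based code inspection.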
External context reduces human effort in V&V tasks.
Abstract
In this paper, we focus on automating two of the widely used Verification and Validation (V&V) activities in the Software Development Lifecycle (SDLC): Software testing and software inspection (also known as review). Concerning the former, we concentrate on automated test case generation using Large Language Models (LLMs). For the latter, we enable inspection of the source code by LLMs. To address the known LLM hallucination problem, in which LLMs confidently produce incorrect outputs, we implement a Retrieval Augmented Generation (RAG) pipeline to integrate supplementary knowledge sources and provide additional context to the LLM. Our experimental results indicate that incorporating external context via the RAG pipeline has a generally positive impact on both test case generation and code inspection. This novel approach reduces the total project cost by saving human…
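To make the retrieve-then-prompt idea concrete, here is a minimal sketch of how supplementary knowledge might be selected and placed in front of an LLM prompt. It is not the authors' pipeline: the bag-of-words cosine retriever, the function names (`retrieve`, `build_prompt`), the prompt wording, and the sample knowledge snippets are all assumptions made for illustration, and the actual LLM call is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents most similar to the query, skipping zero-score ones."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(task, code, documents, k=2):
    """Assemble a prompt: retrieved context first, then the task and the code under test."""
    context = retrieve(task + " " + code, documents, k)
    context_block = "\n".join("- " + c for c in context) or "- (no context found)"
    return (
        "Context:\n" + context_block + "\n\n"
        "Task: " + task + "\n\n"
        "Code:\n" + code + "\n"
    )


# Hypothetical knowledge snippets standing in for an external knowledge source.
KNOWLEDGE = [
    "divide(a, b) raises ZeroDivisionError when b is zero",
    "The logging module writes records to stderr by default",
    "Style guide: every public function needs a docstring",
]

prompt = build_prompt(
    "Write unit tests for divide",
    "def divide(a, b): return a / b",
    KNOWLEDGE,
)
```

The resulting `prompt` string would then be sent to whichever LLM is in use. A production setup would more likely use dense embeddings and a vector store for retrieval; the word-overlap scoring here only keeps the sketch dependency-free.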
