An empirical study of LoRA-based fine-tuning of large language models for automated test case generation
Milad Moradi, Ke Yan, David Colwell, Rhona Asgari

TL;DR
This study empirically evaluates LoRA-based fine-tuning of large language models for automated test case generation. It reports significant performance improvements, especially for open-source models, and compares the fine-tuned models against proprietary models such as GPT-4.
Contribution
It systematically explores hyperparameters of LoRA fine-tuning across multiple models and introduces an automated GPT-4o-based evaluation framework for test case quality assessment.
Findings
LoRA fine-tuning improves open-source model performance significantly.
An 8B open-source model can match GPT-4.1 performance after fine-tuning.
Fine-tuned open-source models are viable, cost-effective alternatives to proprietary systems.
Abstract
Automated test case generation from natural language requirements remains a challenging problem in software engineering due to the ambiguity of requirements and the need to produce structured, executable test artifacts. Recent advances in LLMs have shown promise in addressing this task; however, their effectiveness depends on task-specific adaptation and efficient fine-tuning strategies. In this paper, we present a comprehensive empirical study on the use of parameter-efficient fine-tuning, specifically LoRA, for requirement-based test case generation. We evaluate multiple LLM families, including open-source and proprietary models, under a unified experimental pipeline. The study systematically explores the impact of key LoRA hyperparameters, including rank, scaling factor, and dropout, on downstream performance. We propose an automated evaluation framework based on GPT-4o, which…
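For readers unfamiliar with the three hyperparameters named in the abstract, the following is a minimal sketch of the standard LoRA forward pass, y = W x + (alpha / r) · B (A x), written in plain Python. It is illustrative only and is not taken from the paper: the function names (`lora_forward`, `trainable_params`), the toy matrices, and the choice to apply dropout to the input of the low-rank branch are assumptions made for this example, and none of the values reflect the configurations the authors evaluated.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(x, W, A, B, alpha, dropout=0.0, training=False, rng=None):
    """Compute y = W x + (alpha / r) * B (A h), where h is x after optional dropout.

    W: frozen base weight, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    alpha: scaling factor; r (the rank) is inferred from A
    dropout: probability of zeroing each input of the LoRA branch (assumed placement)
    """
    r = len(A)  # rank = number of rows of A
    h = x
    if training and dropout > 0.0:
        rng = rng or random.Random()
        keep = 1.0 - dropout
        # Inverted dropout: rescale kept entries so the expected value is unchanged.
        h = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    base = matvec(W, x)                # frozen path
    update = matvec(B, matvec(A, h))   # low-rank path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def trainable_params(d_out, d_in, r):
    """Number of trainable parameters LoRA adds for one weight matrix."""
    return r * (d_in + d_out)


# Toy example: 2x2 identity base weight, rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]        # shape (1, 2)
B = [[1.0], [2.0]]      # shape (2, 1)
y = lora_forward([1.0, 2.0], W, A, B, alpha=2.0)
```

In this toy case the low-rank branch contributes (2 / 1) · B (A x) = [6, 12] on top of the frozen output [1, 2]. The parameter count shows why the rank matters for cost: for a hypothetical 4096 × 4096 weight matrix, `trainable_params(4096, 4096, 8)` gives 65,536 trainable values against 16,777,216 in the full matrix.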
