Boosting LLMs for Mutation Generation
Bo Wang, Ming Deng, Mingda Chen, Chengran Yang, Youfang Lin, Mark Harman, Mike Papadakis, Jie M. Zhang

TL;DR
This paper introduces SMART, an LLM-based mutation generation method that combines retrieval-augmented generation and fine-tuning to produce higher-quality, more effective mutants for software testing than existing approaches.
Contribution
SMART integrates retrieval-augmented generation and supervised fine-tuning to significantly improve mutation quality and effectiveness in LLM-based mutation testing.
Findings
Increases mutation validity from 42.89% to 65.6%.
Achieves bug detection rate of 92.61%, surpassing prior methods.
Improves fault localization, ranking 64 more bugs at Top-1.
Abstract
LLM-based mutation testing is a promising testing technology, but existing approaches typically rely on a fixed set of mutations as few-shot examples or none at all. This can result in generic low-quality mutations, missed context-specific mutation patterns, substantial numbers of redundant and uncompilable mutants, and limited semantic similarity to real bugs. To overcome these limitations, we introduce SMART (Semantic Mutation with Adaptive Retrieval and Tuning). SMART integrates retrieval-augmented generation (RAG) on a vectorized dataset of real-world bugs, focused code chunking, and supervised fine-tuning using mutations coupled with real-world bugs. We conducted an extensive empirical study of SMART using 1,991 real-world Java bugs from the Defects4J and ConDefects datasets, comparing SMART to the state-of-the-art LLM-based approaches, LLMut and LLMorpheus. The results reveal that…
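The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not SMART's implementation. The toy corpus, the bag-of-tokens "embedding", the choice to match on the fixed version of each bug, and the names `BUG_CORPUS`, `embed`, `retrieve`, and `build_prompt` are all assumptions made for illustration. A real system would presumably use a learned code embedding and a vector store over real-world bugs.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus of real-bug pairs (fixed code, buggy code).
BUG_CORPUS = [
    {"fixed": "if (index < list.size()) return list.get(index);",
     "buggy": "if (index <= list.size()) return list.get(index);"},
    {"fixed": "return a + b;",
     "buggy": "return a - b;"},
    {"fixed": "if (name != null && name.isEmpty()) return;",
     "buggy": "if (name.isEmpty()) return;"},
]


def embed(code):
    """Stand-in embedding: counts of identifiers and punctuation characters."""
    return Counter(re.findall(r"[A-Za-z_]\w*|[^\sA-Za-z_0-9]", code))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(chunk, k=2):
    """Return the k corpus bugs whose fixed code is most similar to the chunk."""
    query = embed(chunk)
    ranked = sorted(BUG_CORPUS,
                    key=lambda bug: cosine(query, embed(bug["fixed"])),
                    reverse=True)
    return ranked[:k]


def build_prompt(chunk, k=2):
    """Assemble a few-shot prompt from retrieved bugs instead of fixed examples."""
    parts = ["Generate a mutant of the target code. Examples of real bugs:"]
    for example in retrieve(chunk, k):
        parts.append(f"Original: {example['fixed']}\nBuggy: {example['buggy']}")
    parts.append(f"Target: {chunk}\nMutant:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("if (i < items.size()) return items.get(i);"))
```

For a bounds-check chunk like the one in the usage line, the off-by-one bug is retrieved first, so the few-shot examples are tailored to the target code. This is the contrast the abstract draws with approaches that reuse a fixed set of examples.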
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Advanced Malware Detection Techniques
