Code-Aware Prompting: A study of Coverage Guided Test Generation in Regression Setting using LLM
Gabriel Ryan, Siddhartha Jain, Mingyue Shang, Shiqi Wang, Xiaofei Ma, Murali Krishna Ramanathan, Baishakhi Ray

TL;DR
This paper introduces SymPrompt, a code-aware multi-stage prompting strategy for LLMs that significantly improves test coverage and correctness in software testing, especially for complex Python methods.
Contribution
It proposes a novel multi-stage prompting approach that aligns with execution paths, enhancing LLM-generated test coverage without additional training.
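To make the idea of prompts that align with execution paths concrete, here is a minimal sketch. It is not the authors' implementation: it assumes that path constraints can be approximated by the `if` conditions met on the way to each exit of a function, and it ignores loops, exceptions, and nested function calls. The names `path_constraints`, `build_prompts`, and `EXAMPLE_SOURCE` are hypothetical and chosen only for illustration. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast
import textwrap

# Hypothetical focal method used only to illustrate the sketch.
EXAMPLE_SOURCE = '''
def classify(x):
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    return "positive"
'''


def _walk(stmts, constraints):
    """Return (constraints, returned) pairs, one per path through stmts.

    Assumption: only `if` and `return` affect control flow; every other
    statement (loops, try blocks, calls) is treated as straight-line code.
    """
    if not stmts:
        return [(constraints, False)]
    head, rest = stmts[0], stmts[1:]
    if isinstance(head, ast.Return):
        return [(constraints, True)]
    if isinstance(head, ast.If):
        cond = ast.unparse(head.test)
        paths = []
        branches = ((head.body, cond), (head.orelse, f"not ({cond})"))
        for branch, label in branches:
            for cs, returned in _walk(branch, constraints + (label,)):
                if returned:
                    paths.append((cs, True))
                else:
                    # Branch fell through: continue with the statements after the if.
                    paths.extend(_walk(rest, cs))
        return paths
    return _walk(rest, constraints)


def path_constraints(source):
    """List the branch conditions along each path of the first function in source."""
    tree = ast.parse(textwrap.dedent(source))
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
    return [list(cs) for cs, _ in _walk(func.body, ())]


def build_prompts(source):
    """Build one test-generation prompt per path, stating that path's conditions."""
    source = textwrap.dedent(source).strip()
    prompts = []
    for i, constraints in enumerate(path_constraints(source), start=1):
        condition = " and ".join(constraints) or "no branch conditions"
        prompts.append(
            f"{source}\n\n"
            f"# Test case {i}: write a test whose inputs satisfy: {condition}\n"
        )
    return prompts


if __name__ == "__main__":
    for prompt in build_prompts(EXAMPLE_SOURCE):
        print(prompt)
```

Each prompt would then be sent to the model in turn, so that every generated test targets a different path rather than leaving path selection to the model. How the paper actually collects constraints and supplies context is not described in the text above, so treat the structure here as an assumption.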
Findings
SymPrompt increases the number of correct test generations by a factor of 5.
Coverage improves by 26% with SymPrompt on Python projects.
Using GPT-4, SymPrompt doubles the coverage compared to baseline prompts.
Abstract
Testing plays a pivotal role in ensuring software quality, yet conventional Search Based Software Testing (SBST) methods often struggle with complex software units, achieving suboptimal test coverage. Recent works using large language models (LLMs) for test generation have focused on improving generation quality through optimizing the test generation context and correcting errors in model outputs, but use fixed prompting strategies that prompt the model to generate tests without additional guidance. As a result, LLM-generated test suites still suffer from low coverage. In this paper, we present SymPrompt, a code-aware prompting strategy for LLMs in test generation. SymPrompt's approach is based on recent work that demonstrates LLMs can solve more complex logical problems when prompted to reason about the problem in a multi-step fashion. We apply this methodology to test generation by…
Taxonomy
Topics: Machine Learning and Data Classification
Methods: Attention Is All You Need · Linear Layer · Residual Connection · Layer Normalization · Multi-Head Attention · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Position-Wise Feed-Forward Layer · Label Smoothing
