Impact of Code Context and Prompting Strategies on Automated Unit Test Generation with Modern General-Purpose Large Language Models

Jakub Walczak; Piotr Tomalak; Artur Laskowski

arXiv:2507.14256·cs.SE·July 22, 2025


TL;DR

This study evaluates how code context and prompting strategies influence the effectiveness of automated unit test generation with large language models, showing that including docstrings in the context and using chain-of-thought prompting both improve the generated tests significantly.

Contribution

It systematically analyzes the effects of code context and prompting strategies on LLM-generated unit tests, highlighting the effectiveness of chain-of-thought prompting and detailed context inclusion.

Findings

01

Including docstrings improves code adequacy.

02

Extending the context to the full implementation yields markedly smaller additional gains.

03

Chain-of-thought prompting achieves up to 96.3% branch coverage.
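
To make the two variables in these findings concrete, the sketch below builds prompts at three code-context levels (signature only, signature plus docstring, full implementation) with an optional chain-of-thought instruction. It is an illustration only: the function names `build_context` and `build_prompt`, the sample function `clamp`, and the prompt wording are all hypothetical and are not taken from the paper, whose exact prompts are not shown here.

```python
import ast

# Hypothetical function under test; not from the paper.
SOURCE = '''
def clamp(value, low, high):
    """Return value limited to the inclusive range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value
'''


def build_context(source: str, level: str) -> str:
    """Return the code context for one of three assumed context levels."""
    func = ast.parse(source).body[0]
    # Handles plain positional parameters only, which is enough for a sketch.
    params = ", ".join(arg.arg for arg in func.args.args)
    signature = f"def {func.name}({params}):"
    if level == "signature":
        return signature
    if level == "docstring":
        doc = ast.get_docstring(func) or ""
        return f'{signature}\n    """{doc}"""'
    if level == "full":
        return source.strip()
    raise ValueError(f"unknown context level: {level}")


def build_prompt(source: str, level: str, chain_of_thought: bool = False) -> str:
    """Assemble a test-generation prompt; the wording is an assumption."""
    parts = [
        "Write pytest unit tests for the following Python function.",
        build_context(source, level),
    ]
    if chain_of_thought:
        # Assumed phrasing of a chain-of-thought instruction.
        parts.append(
            "First reason step by step about the branches and edge cases, "
            "then write the tests."
        )
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt(SOURCE, "docstring", chain_of_thought=True))
```

Varying `level` and `chain_of_thought` independently gives the kind of grid of conditions that such a study would compare, with the resulting tests then scored by a coverage tool.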

Abstract

Generative AI is gaining increasing attention in software engineering, where testing remains an indispensable reliability mechanism. According to the widely adopted testing pyramid, unit tests constitute the majority of test cases and are often schematic, requiring minimal domain expertise. Automatically generating such tests under the supervision of software engineers can significantly enhance productivity during the development phase of the software lifecycle. This paper investigates the impact of code context and prompting strategies on the quality and adequacy of unit tests generated by various large language models (LLMs) across several families. The results show that including docstrings notably improves code adequacy, while further extending context to the full implementation yields markedly smaller gains. Notably, the chain-of-thought prompting strategy -- applied even to…
