LLM4TDD: Best Practices for Test Driven Development Using Large Language Models
Sanyogita Piya, Allison Sullivan

TL;DR
This paper explores how guiding Large Language Models with test-driven development practices can improve code generation, evaluated through empirical experiments with ChatGPT on LeetCode problems.
Contribution
It introduces LLM4TDD, a novel approach integrating TDD principles with LLMs for iterative code synthesis, and provides empirical insights into its effectiveness.
Findings
Test-driven prompts improve code correctness.
Prompt and problem attributes significantly affect LLM performance.
Empirical evaluation demonstrates potential of LLM4TDD in software development.
Abstract
In today's society, we are becoming increasingly dependent on software systems. However, we also constantly witness the negative impacts of buggy software. Program synthesis aims to improve software correctness by automatically generating the program given an outline of the expected behavior. For decades, program synthesis has been an active research field, with recent approaches looking to incorporate Large Language Models to help generate code. This paper explores the concept of LLM4TDD, where we guide Large Language Models to generate code iteratively using a test-driven development methodology. We conduct an empirical evaluation using ChatGPT and coding problems from LeetCode to investigate the impact of different test, prompt and problem attributes on the efficacy of LLM4TDD.
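The iterative, test-driven loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `failing_tests`, `tdd_loop`, and `make_stub_model` are hypothetical, and the stub model simply replays canned candidates in place of a real LLM such as ChatGPT. The sketch assumes tests are revealed one at a time and that the model is re-prompted whenever a revealed test fails.

```python
from typing import Callable, List, Tuple

TestCase = Tuple[tuple, object]  # (positional arguments, expected result)


def failing_tests(source: str, func_name: str, tests: List[TestCase]) -> List[TestCase]:
    """Run candidate source and return the tests it does not pass."""
    namespace: dict = {}
    try:
        # Sketch only: real use would need a sandbox for generated code.
        exec(source, namespace)
        func = namespace[func_name]
    except Exception:
        return list(tests)  # code that does not load fails every test
    failed = []
    for args, expected in tests:
        try:
            ok = func(*args) == expected
        except Exception:
            ok = False
        if not ok:
            failed.append((args, expected))
    return failed


def tdd_loop(generate: Callable[[str], str], func_name: str,
             tests: List[TestCase], max_attempts: int = 3):
    """Reveal tests one at a time; re-prompt while any revealed test fails."""
    revealed: List[TestCase] = []
    source = ""  # no code yet, so the first test fails and triggers a prompt
    for test in tests:
        revealed.append(test)
        for _ in range(max_attempts):
            failed = failing_tests(source, func_name, revealed)
            if not failed:
                break
            # Hypothetical prompt format; the paper studies prompt attributes.
            prompt = (f"Write `{func_name}` so these tests pass: {revealed}. "
                      f"Currently failing: {failed}. Previous code:\n{source}")
            source = generate(prompt)
    return source, failing_tests(source, func_name, revealed)


def make_stub_model(candidates: List[str]) -> Callable[[str], str]:
    """Stand-in for an LLM: returns canned candidates in order (assumption)."""
    remaining = iter(candidates)
    return lambda prompt: next(remaining)


model = make_stub_model([
    "def add(a, b):\n    return 3\n",      # passes only the first test
    "def add(a, b):\n    return a + b\n",  # generalizes after the second test
])
final_source, still_failing = tdd_loop(model, "add", [((1, 2), 3), ((2, 2), 4)])
```

In this run, the first candidate satisfies the first test, the second test exposes it as hard-coded, and the re-prompt yields a candidate that passes both tests.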
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Software Reliability and Analysis Research
