LLMLOOP: Improving LLM-Generated Code and Tests through Automated Iterative Feedback Loops
Ravin Ravi, Dylan Bradshaw, Stefano Ruberto, Gunel Jahangirova, and Valerio Terragni

TL;DR
LLMLOOP is a framework that automates iterative feedback loops to enhance the quality of code and tests generated by Large Language Models, reducing manual effort and improving correctness.
Contribution
This paper introduces LLMLOOP, a novel automated framework that refines LLM-generated code and tests through multiple iterative feedback loops.
Findings
Significantly reduces compilation and static analysis errors.
Improves test case quality and coverage.
Effective on the HUMANEVAL-X benchmark.
Abstract
Large Language Models (LLMs) are showing remarkable performance in generating source code, yet the generated code often has issues like compilation errors or incorrect code. Researchers and developers often face wasted effort in implementing checks and refining LLM-generated code, frequently duplicating their efforts. This paper presents LLMLOOP, a framework that automates the refinement of both source code and test cases produced by LLMs. LLMLOOP employs five iterative loops: resolving compilation errors, addressing static analysis issues, fixing test case failures, and improving test quality through mutation analysis. These loops ensure the generation of high-quality test cases that serve as both a validation mechanism and a regression test suite for the generated code. We evaluated LLMLOOP on HUMANEVAL-X, a recent benchmark of programming tasks. Results demonstrate the tool's…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
