RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance
Haolin Jin, Zechao Sun, Huaming Chen

TL;DR
The paper introduces RGD, a multi-LLM agent framework that improves code generation accuracy through iterative refinement and debugging, achieving state-of-the-art results on standard datasets.
Contribution
It proposes a novel multi-agent LLM architecture for automated code refinement and debugging, enhancing code quality beyond traditional prompt-based methods.
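As a rough illustration of the refine-and-debug idea, the loop below generates code, runs it against test cases, and feeds the failure messages back into the next generation round. This is a minimal sketch under stated assumptions, not the RGD architecture itself: `run_tests`, `refine_loop`, and `stub_generator` are hypothetical names, and the stub stands in for an LLM call so the example runs offline.

```python
from typing import Callable, List, Tuple

# A test case is (positional args, expected return value).
TestCase = Tuple[tuple, object]


def run_tests(source: str, func_name: str, tests: List[TestCase]) -> List[str]:
    """Execute candidate source and return one message per failing test."""
    namespace: dict = {}
    try:
        # exec is only acceptable here because the source is trusted demo code.
        exec(source, namespace)
    except Exception as exc:
        return [f"code failed to load: {exc!r}"]
    func = namespace.get(func_name)
    if not callable(func):
        return [f"function {func_name!r} is not defined"]
    failures = []
    for args, expected in tests:
        try:
            actual = func(*args)
        except Exception as exc:
            failures.append(f"{func_name}{args} raised {exc!r}")
            continue
        if actual != expected:
            failures.append(
                f"{func_name}{args} returned {actual!r}, expected {expected!r}"
            )
    return failures


def refine_loop(
    generate: Callable[[List[str]], str],
    func_name: str,
    tests: List[TestCase],
    max_rounds: int = 3,
) -> Tuple[str, bool, int]:
    """Regenerate code until all tests pass or the round budget is spent."""
    feedback: List[str] = []
    source = ""
    for round_index in range(1, max_rounds + 1):
        source = generate(feedback)           # hypothetical "generator" agent
        feedback = run_tests(source, func_name, tests)
        if not feedback:                      # no failures: accept this version
            return source, True, round_index
    return source, False, max_rounds


def stub_generator(feedback: List[str]) -> str:
    """Stand-in for an LLM: a buggy first draft, a fixed draft after feedback."""
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


TESTS: List[TestCase] = [((1, 2), 3), ((0, 0), 0)]

if __name__ == "__main__":
    final_source, passed, rounds = refine_loop(stub_generator, "add", TESTS)
    print(passed, rounds)
```

In a real system the generator would be a model call that receives the task description alongside the failure messages, and the candidate code would run in a sandbox rather than through `exec`.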
Findings
Achieves a 9.8% improvement on the HumanEval dataset.
Achieves a 16.2% improvement on the MBPP dataset.
Demonstrates effective autonomous code refinement and debugging.
Abstract
Large Language Models (LLMs) have shown incredible potential in code generation tasks, and recent research in prompt engineering has enhanced LLMs' understanding of textual information. However, ensuring the accuracy of generated code often requires extensive testing and validation by programmers. While LLMs can typically generate code based on task descriptions, their accuracy remains limited, especially for complex tasks that require a deeper understanding of both the problem statement and the code generation process. This limitation is primarily due to the LLMs' need to simultaneously comprehend text and generate syntactically and semantically correct code, without having the capability to automatically refine the code. In real-world software development, programmers rarely produce flawless code in a single attempt based on the task description alone; they rely on iterative feedback…
Taxonomy
Topics: Multi-Agent Systems and Negotiation
