CodeCoR: An LLM-Based Self-Reflective Multi-Agent Framework for Code Generation
Ruwei Pan, Hongyu Zhang, Chao Liu

TL;DR
CodeCoR is a self-reflective multi-agent framework that evaluates and iteratively repairs generated code, improving correctness and robustness over existing multi-agent baselines such as CodeCoT and MapCoder.
Contribution
The paper presents a novel self-reflective multi-agent system that evaluates, prunes, and repairs generated code, improving performance on automated code generation tasks.
Findings
Achieves an average Pass@1 score of 77.8% on four datasets.
Outperforms existing baselines like CodeCoT and MapCoder.
Demonstrates robustness through iterative testing and repair processes.
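For readers unfamiliar with the metric, Pass@1 is the fraction of tasks for which the single generated solution passes all tests. The sketch below shows one way an average over several datasets could be computed. It assumes an unweighted (macro) mean across datasets, which the summary does not confirm, and the dataset names and outcomes are made up purely for illustration.

```python
from typing import Dict, List


def pass_at_1(results: List[bool]) -> float:
    """Fraction of tasks whose single generated solution passed all tests."""
    if not results:
        raise ValueError("results must be non-empty")
    return sum(results) / len(results)


def macro_average(per_dataset: Dict[str, List[bool]]) -> float:
    """Unweighted mean of per-dataset Pass@1 scores (an assumed averaging scheme)."""
    scores = [pass_at_1(outcomes) for outcomes in per_dataset.values()]
    return sum(scores) / len(scores)


# Hypothetical per-task outcomes, not results from the paper.
toy_outcomes = {
    "dataset_a": [True, True, False, True],
    "dataset_b": [True, False],
}

if __name__ == "__main__":
    for name, outcomes in toy_outcomes.items():
        print(f"{name}: Pass@1 = {pass_at_1(outcomes):.3f}")
    print(f"average: {macro_average(toy_outcomes):.3f}")
```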
Abstract
Code generation aims to automatically produce code that fulfills requirements written in natural language. Large Language Models (LLMs) like ChatGPT have demonstrated promising effectiveness in this area. Nonetheless, these LLMs often fail to ensure the syntactic and semantic correctness of the generated code. Recently, researchers proposed multi-agent frameworks that guide LLMs with different prompts to analyze programming tasks, generate code, and perform testing in a sequential workflow. However, the workflow is not robust, as code generation depends on the performance of each agent. To address this challenge, we propose CodeCoR, a self-reflective multi-agent framework that evaluates the effectiveness of each agent and their collaborations. Specifically, for a given task description, four agents in CodeCoR generate prompts, code, test cases, and repair advice,…
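The general idea of generating code, testing it, and feeding repair advice back into generation can be sketched as a small loop. This is a minimal illustration, not CodeCoR's implementation: the excerpt does not spell out the control flow, so the loop structure, the `max_rounds` limit, and every function name here are assumptions. The agents are plain stub functions standing in for LLM calls, and the prompt-generating agent and any pruning step are omitted for brevity.

```python
from typing import Callable, List, Optional, Tuple

# A test case is (positional arguments, expected return value).
TestCase = Tuple[tuple, object]


def run_tests(code: str, func_name: str, tests: List[TestCase]) -> List[str]:
    """Execute candidate code and return a list of failure messages."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # real systems should sandbox untrusted code
        func = namespace[func_name]
    except Exception as exc:
        return [f"load error: {exc!r}"]
    failures = []
    for args, expected in tests:
        try:
            got = func(*args)
        except Exception as exc:
            failures.append(f"{args}: raised {exc!r}")
            continue
        if got != expected:
            failures.append(f"{args}: expected {expected!r}, got {got!r}")
    return failures


def generate_and_repair(
    task: str,
    func_name: str,
    code_agent: Callable[[str, Optional[str]], str],
    test_agent: Callable[[str], List[TestCase]],
    repair_agent: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return code that passes the generated tests, or None if repair fails."""
    tests = test_agent(task)
    code = code_agent(task, None)
    for round_idx in range(max_rounds + 1):
        failures = run_tests(code, func_name, tests)
        if not failures:
            return code
        if round_idx == max_rounds:
            break
        advice = repair_agent(code, failures)
        code = code_agent(task, advice)
    return None


# Hypothetical stub agents: the first draft is buggy, the repaired one is not.
def stub_test_agent(task: str) -> List[TestCase]:
    return [((1, 2), 3), ((0, 0), 0)]


def stub_code_agent(task: str, advice: Optional[str]) -> str:
    if advice is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def stub_repair_agent(code: str, failures: List[str]) -> str:
    return "use + instead of -"


if __name__ == "__main__":
    print(generate_and_repair("add two numbers", "add",
                              stub_code_agent, stub_test_agent, stub_repair_agent))
```

With the stubs above, the first draft fails one test, the repair advice triggers a second draft, and that draft is returned. Setting `max_rounds=0` disables repair, so the buggy first draft yields `None`.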
