LLM-Based Multi-Agent Systems for Code Generation: A Multi-Vocal Literature Review
Zeeshan Rasheed, Muhammad Waseem, Kai-Kristian Kemell, Mika Saari, Pekka Abrahamsson

TL;DR
This literature review synthesizes academic and industrial research on LLM-based multi-agent systems for code generation, highlighting motivations, models, challenges, solutions, and future directions.
Contribution
It is the first comprehensive multi-vocal review combining peer-reviewed and grey literature on this topic, providing structured insights and identifying research gaps.
Findings
Classified nine categories of reasons for adopting multi-agent systems
Analyzed models and benchmarks to give an overview of LLM configurations and evaluation practices
Synthesized challenges and solutions into six main categories and 26 subcategories
Abstract
Large Language Models (LLMs) have enabled multi-agent systems to perform autonomous code generation for complex tasks. Despite the recent growth in research and industrial applications in this area, there is little work on synthesizing evidence from both academic and industrial sources to capture the current state of research on LLM-based multi-agent systems for code generation. To this end, we conducted a Multi-Vocal Literature Review (MLR), combining insights from both academia and industry, including peer-reviewed studies and grey literature. The aim of this study is to systematically synthesize and analyze existing knowledge on LLM-based multi-agent systems for code generation. Specifically, the review examines the motivations for their use, employed benchmarks and models, key challenges, proposed solutions, and potential directions for future research. We selected and reviewed…
