TL;DR
This paper presents CodePori, a large-scale LLM-based multi-agent system for autonomous software development, and evaluates it through participant feedback to identify strengths and limitations that standard benchmarks miss.
Contribution
It introduces a novel multi-agent system, CodePori, and provides empirical insights into its practical performance and challenges in real-world software development.
Findings
Participant feedback highlights key strengths and challenges.
Standard benchmarks miss critical aspects of real-world performance.
Addressing memory, hallucinations, and code quality is essential for success.
Abstract
Context: LLM-based multi-agent systems enable automation and decision support in software development, yet existing studies rely on benchmark datasets offering only binary pass-or-fail results, limiting insight into real-world applicability. Objective: This study empirically investigates the potential and limitations of LLM-based agents in autonomous software development tasks. Method: A two-phase approach was employed: developing a multi-agent system, CodePori, for automated code generation, and conducting participant-based evaluation to assess practical performance. Results: Participant feedback reveals key strengths, challenges, and areas for improvement in LLM-based multi-agent systems, highlighting aspects missed by standard code-generation benchmarks. Conclusions: While LLM-based multi-agent systems show potential for large-scale software development, successful integration…
Taxonomy
Topics: Software Engineering Research · Multi-Agent Systems and Negotiation · Open Source Software Innovations
