Can LLMs be Effective Code Contributors? A Study on Open-source Projects

Chun Jie Chong; Muyeed Ahmed; Zhihao (Zephyr) Yao; Iulian Neamtiu

arXiv:2604.23340·cs.SE·April 28, 2026

TL;DR

This study evaluates the effectiveness of large language models in contributing code to open-source projects, revealing significant shortcomings and variability in success rates across different projects and models.

Contribution

The paper introduces a framework for assessing LLMs' suitability for code contributions and provides empirical results on their performance in real open-source projects.

Findings

01 The LLMs' success rate ranged from 0% to 60%, depending on the project.

02 The LLMs often generated syntactically incorrect or unverified code.

03 The LLMs struggled with generating new code and managing context size.

Abstract

LLM-generated code is widely used, and the share of committed code produced by LLMs is expected to increase. However, we are not at a point where LLMs can be effective contributors to production code. We present an approach that exposes the shortcomings of LLM generation on such projects and proposes recommendations; the targets of our study are sizable open-source projects, e.g., FFmpeg and wolfSSL. First, we develop a framework that uses verification and validation to evaluate a given LLM's suitability to fix bugs in, or add features to, an existing project. Second, we apply the framework to 212 commits (bug fixes and small feature improvements) in eight popular open-source projects and three LLMs: GPT-4o, Ministral3, and Qwen3-Coder. The success rate varied from 0% to 60% depending on the project. The LLMs failed in a variety of ways, from generating syntactically incorrect code, to…
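To make the per-project success rate concrete, here is a minimal sketch of how such a rate could be tallied. It is not the authors' framework: the `Attempt` record, its field names, and the rule that an attempt succeeds only if it both builds (standing in for verification) and passes the project's tests (standing in for validation) are all assumptions made for illustration. The project and model names in the usage lines are placeholders, not data from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    """One LLM attempt at reproducing a commit (hypothetical record layout)."""
    project: str
    model: str
    compiles: bool    # assumed stand-in for the verification stage
    tests_pass: bool  # assumed stand-in for the validation stage


def is_success(attempt: Attempt) -> bool:
    # Assumption: an attempt counts only if it passes both stages.
    return attempt.compiles and attempt.tests_pass


def success_rates(attempts):
    """Return {(project, model): fraction of successful attempts}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for a in attempts:
        key = (a.project, a.model)
        totals[key] += 1
        if is_success(a):
            wins[key] += 1
    return {key: wins[key] / totals[key] for key in totals}


# Placeholder inputs, purely illustrative.
attempts = [
    Attempt("project_a", "model_x", compiles=True, tests_pass=True),
    Attempt("project_a", "model_x", compiles=True, tests_pass=False),
    Attempt("project_b", "model_x", compiles=False, tests_pass=False),
]
rates = success_rates(attempts)
```

Keying the tally on the (project, model) pair keeps the two sources of variation separate, so a rate can be read per project, per model, or both.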
