Debt Behind the AI Boom: A Large-Scale Empirical Study of AI-Generated Code in the Wild
Yue Liu, Ratnadira Widyasari, Yanjie Zhao, Ivana Clairine Irsan, Junkai Chen, David Lo

TL;DR
This large-scale empirical study investigates the long-term impact of AI-generated code on software quality and maintenance, revealing persistent issues and technical debt in real-world repositories.
Contribution
The study provides the first extensive analysis of AI-generated code issues in production repositories, quantifying their lifecycle and long-term effects on software maintenance.
Findings
Code smells are the most common issue type, accounting for 89.3% of issues.
Over 15% of AI-authored commits introduce at least one issue.
22.7% of AI-introduced issues persist in the latest repository versions.
Abstract
AI coding assistants are now widely used in software development. Software developers increasingly integrate AI-generated code into their codebases to improve productivity. Prior studies have shown that AI-generated code may contain code quality issues under controlled settings. However, we still know little about the real-world impact of AI-generated code on software quality and maintenance after it is introduced into production repositories. In other words, it remains unclear whether such issues are quickly fixed or persist and accumulate over time as technical debt. In this paper, we conduct a large-scale empirical study on the technical debt introduced by AI coding assistants in the wild. To achieve that, we built a dataset of 302.6k verified AI-authored commits from 6,299 GitHub repositories, covering five widely used AI coding assistants. For each commit, we run static analysis…
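The abstract is cut off before it describes the analysis pipeline, so the following is only a minimal sketch of one way the "introduced" and "persisting" issue counts could be derived from static-analysis results. It is not the authors' method. The `Issue` fingerprint, both function names, and the toy data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical fingerprint of one static-analysis finding."""
    rule: str     # rule identifier reported by the analyzer
    path: str     # file in which the finding was reported
    snippet: str  # normalized offending code (more stable than line numbers)


def introduced_issues(before, after):
    """Findings present after the commit but absent in its parent."""
    return after - before


def persistence_rate(introduced, latest):
    """Share of introduced findings still present in the latest snapshot."""
    if not introduced:
        return 0.0
    return len(introduced & latest) / len(introduced)


# Toy data, not taken from the paper.
old = Issue("unused-variable", "app.py", "tmp = 1")
smell = Issue("long-method", "app.py", "def handle(...)")
bug = Issue("null-deref", "db.py", "row.id")

parent_snapshot = {old}
commit_snapshot = {old, smell, bug}
latest_snapshot = {old, smell}  # the bug was fixed later, the smell was not

new = introduced_issues(parent_snapshot, commit_snapshot)
rate = persistence_rate(new, latest_snapshot)
print(len(new), rate)
```

On the toy data, the commit introduces two findings and one of them survives, so the script prints `2 0.5`. Matching on a code snippet rather than a line number is a design assumption here; it keeps a finding recognizable when unrelated edits shift it within the file.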
