AI-Generated Smells: An Analysis of Code and Architecture in LLM and Agent-Driven Development
Yuecai Zhu, Nikolaos Tsantalis, Peter C. Rigby

TL;DR
This paper systematically audits AI-generated software and finds a trade-off between model capability and architectural complexity: more capable models produce more bloated, coupled code. It argues that architectural foresight is needed to keep such code maintainable.
Contribution
The paper identifies a fundamental Reasoning-Complexity Trade-off in AI-generated code and introduces the Volume-Quality Inverse Law, which links code volume to structural degradation.
Findings
AI does not eliminate flaws but introduces a machine signature of defects.
Increased model capability leads to bloated, coupled code.
Code volume strongly predicts structural degradation.
Abstract
The promise of Large Language Models in automated software engineering is often measured by functional correctness, overlooking the critical issue of long-term maintainability. This paper presents a systematic audit of technical debt in AI-generated software, revealing that AI does not eliminate flaws but rather introduces a distinct machine signature of defects. Our multi-scale analysis, spanning single-file algorithmic tasks and complex, agent-generated systems, identifies a fundamental Reasoning-Complexity Trade-off: as models become more capable, they generate increasingly bloated and coupled code. This architectural decay is so pronounced that we establish a Volume-Quality Inverse Law, where code volume is a near-perfect predictor of structural degradation. Crucially, we demonstrate that neither functional correctness nor detailed prompting mitigates this decay. These findings…
