Assessing the Quality and Security of AI-Generated Code: A Quantitative Analysis
Abbas Sabra, Olivier Schmitt, Joseph Tyler

TL;DR
This paper quantitatively evaluates the code quality and security of five major LLMs generating Java code, revealing systemic weaknesses and security vulnerabilities that are not indicated by functional performance metrics.
Contribution
It provides a comprehensive static analysis of LLM-generated code, highlighting shared security and quality issues across multiple models, and emphasizes the importance of verification beyond functional testing.
Findings
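LLMs can generate functional code, but that code often contains bugs, security vulnerabilities, and code smells.
Functional success shows no correlation with code security or quality.
Defects recur across models, which suggests systemic weaknesses in current LLM code generation methods rather than isolated failures.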
Abstract
This study presents a quantitative evaluation of the code quality and security of five prominent Large Language Models (LLMs): Claude Sonnet 4, Claude 3.7 Sonnet, GPT-4o, Llama 3.2 90B, and OpenCoder 8B. While prior research has assessed the functional performance of LLM-generated code, this research tested LLM output from 4,442 Java coding assignments through comprehensive static analysis using SonarQube. The findings suggest that although LLMs can generate functional code, they also introduce a range of software defects, including bugs, security vulnerabilities, and code smells. These defects do not appear to be isolated; rather, they may represent shared weaknesses stemming from systemic limitations within current LLM code generation methods. In particular, critically severe issues, such as hard-coded passwords and path traversal vulnerabilities, were observed across multiple models.…
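To make the two defect classes named in the abstract concrete, here is a minimal Java sketch of what a hard-coded password and a path traversal weakness can look like, next to a safer variant of each. It is an illustration only and is not taken from the study's dataset or from any model output. The class name, method names, the DB_PASSWORD variable, and the /srv/app/uploads directory are all hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class DefectClassSketch {

    // Flagged pattern: a credential embedded in source code (placeholder value, illustration only).
    static final String HARD_CODED_PASSWORD = "changeme";

    // Safer alternative: read the secret from the environment at runtime.
    // DB_PASSWORD is a hypothetical variable name chosen for this sketch.
    static String passwordFromEnvironment() {
        String value = System.getenv("DB_PASSWORD");
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("DB_PASSWORD is not set");
        }
        return value;
    }

    // Flagged pattern: user input is joined to the base directory with no containment check,
    // so a name such as "../../etc/passwd" resolves to a location outside the base directory.
    static Path resolveUnchecked(Path baseDir, String userSuppliedName) {
        return baseDir.resolve(userSuppliedName).normalize();
    }

    // Safer alternative: normalize first, then reject any path that leaves the base directory.
    static Path resolveChecked(Path baseDir, String userSuppliedName) {
        Path base = baseDir.toAbsolutePath().normalize();
        Path candidate = base.resolve(userSuppliedName).normalize();
        if (!candidate.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes base directory: " + userSuppliedName);
        }
        return candidate;
    }

    public static void main(String[] args) {
        Path base = Paths.get("/srv/app/uploads"); // hypothetical upload directory
        System.out.println("Unchecked: " + resolveUnchecked(base, "../../etc/passwd"));
        System.out.println("Checked:   " + resolveChecked(base, "report.txt"));
        try {
            resolveChecked(base, "../../etc/passwd");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected:  " + e.getMessage());
        }
    }
}
```

Both unsafe variants compile and would pass a functional test that only exercises well-formed input, which is consistent with the abstract's point that functional metrics alone do not reveal these issues.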
Taxonomy
Topics: Law, AI, and Intellectual Property · Software Engineering Research · Advanced Malware Detection Techniques
