Security and Quality in LLM-Generated Code: A Multi-Language, Multi-Model Analysis
Mohammed Kharma, Soohyeon Choi, Mohammed AlKhanafseh, David Mohaisen

TL;DR
This study evaluates the security and quality of code generated by Large Language Models across multiple programming languages, revealing variability in security effectiveness and highlighting areas for improvement in AI-driven code generation.
Contribution
The paper introduces a new dataset and a comprehensive analysis of the security of LLM-generated code across languages, emphasizing the need to integrate modern security practices into AI models.
Findings
LLMs' security performance varies by language
Many models do not utilize recent security features
Outdated methods are still prevalent in generated code
Abstract
Artificial Intelligence (AI)-driven code generation tools are increasingly used throughout the software development lifecycle to accelerate coding tasks. However, the security of code generated by Large Language Models (LLMs) remains underexplored, with studies revealing various risks and weaknesses. This paper analyzes the security of code generated by LLMs across different programming languages. We introduce a dataset of 200 tasks grouped into six categories to evaluate the performance of LLMs in generating secure and maintainable code. Our research shows that while LLMs can automate code creation, their security effectiveness varies by language. Many models fail to use modern security features introduced in recent compiler and toolkit updates, such as Java 17. Moreover, outdated methods are still commonly used, particularly in C++. This highlights the need for advancing LLMs to…
Taxonomy
Topics: Digital Rights Management and Security
