WildCode: An Empirical Analysis of Code Generated by ChatGPT

Kobra Khanmohammadi, Pooria Roy, Raphael Khoury, Abdelwahab Hamou-Lhadj, and Wilfried Patrick Konan

arXiv:2512.04259·cs.CR·December 5, 2025

Open Access

TL;DR

This paper presents a large-scale empirical analysis of real-world code generated by ChatGPT, assessing its correctness and security, and examining user behavior regarding security concerns.

Contribution

It provides the first large-scale real-world evaluation of ChatGPT-generated code's correctness and security, highlighting user attitudes and the limitations of AI-generated code.

Findings

01. LLM-generated code often lacks security robustness
02. Users rarely inquire about security features of generated code
03. Real-world code quality aligns with previous synthetic studies

Abstract

LLMs are increasingly used to generate code, but the quality and security of this code are often uncertain. Several recent studies have raised alarm bells, indicating that such AI-generated code may be particularly vulnerable to cyberattacks. However, most of these studies rely on code that is generated specifically for the study, which raises questions about the realism of such experiments. In this study, we perform a large-scale empirical analysis of real-life code generated by ChatGPT. We evaluate code generated by ChatGPT both with respect to correctness and security and delve into the intentions of users who request code from the model. Our research confirms previous studies that used synthetic queries and yielded evidence that LLM-generated code is often inadequate with respect to security. We also find that users exhibit little curiosity about the security features of the…

Taxonomy

Topics: Advanced Malware Detection Techniques · Artificial Intelligence in Healthcare and Education · Software Engineering Research