Just another copy and paste? Comparing the security vulnerabilities of ChatGPT generated code and StackOverflow answers
Sivana Hamer, Marcelo d'Amorim, Laurie Williams

TL;DR
This study empirically compares the security vulnerabilities of code snippets generated by ChatGPT with those of snippets from StackOverflow answers. ChatGPT's code contains fewer vulnerabilities and fewer CWE types, but both sources pose security risks.
Contribution
It provides the first empirical comparison of security vulnerabilities in AI-generated code versus human-generated code from StackOverflow.
Findings
ChatGPT code has 20% fewer vulnerabilities than StackOverflow snippets.
ChatGPT-generated code contained 19 CWE types, fewer than the 22 found in StackOverflow snippets.
Both sources contain numerous unique vulnerabilities, highlighting the need for better security awareness.
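To make the arithmetic behind these findings concrete, here is a minimal Python sketch of how a relative reduction and a count of distinct CWE types could be computed from two lists of findings. It is an illustration only, not the authors' analysis pipeline: the function name `compare_sources` and the toy CWE labels are hypothetical, and the counts are invented so that the numbers work out to a 20% reduction.

```python
from collections import Counter


def compare_sources(so_findings, gpt_findings):
    """Summarize two lists of CWE labels (one label per detected vulnerability).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not so_findings:
        raise ValueError("so_findings must not be empty")
    so_counts = Counter(so_findings)
    gpt_counts = Counter(gpt_findings)
    so_total = len(so_findings)
    gpt_total = len(gpt_findings)
    return {
        "so_total": so_total,
        "gpt_total": gpt_total,
        # Fraction by which the ChatGPT total is lower than the SO total.
        "relative_reduction": (so_total - gpt_total) / so_total,
        # Number of distinct CWE types seen in each source.
        "so_types": len(so_counts),
        "gpt_types": len(gpt_counts),
        # CWE types that appear in both sources.
        "shared_types": sorted(set(so_counts) & set(gpt_counts)),
    }


# Toy data (assumed, NOT the study's results): 10 SO findings, 8 ChatGPT findings.
so = ["CWE-327"] * 4 + ["CWE-328"] * 3 + ["CWE-295"] * 3
gpt = ["CWE-327"] * 5 + ["CWE-328"] * 3

summary = compare_sources(so, gpt)
print(summary)
```

With this toy input the ChatGPT list has 8 findings against 10 for StackOverflow, so `relative_reduction` is 0.2, and the distinct-type counts are 2 and 3 respectively. A smaller total does not imply safety: every type in the ChatGPT list here also appears in the StackOverflow list.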
Abstract
Sonatype's 2023 report found that 97% of developers and security leads integrate generative Artificial Intelligence (AI), particularly Large Language Models (LLMs), into their development process. Concerns about the security implications of this trend have been raised. Developers are now weighing the benefits and risks of LLMs against other relied-upon information sources, such as StackOverflow (SO), requiring empirical data to inform their choice. In this work, our goal is to raise software developers' awareness of the security implications when selecting code snippets by empirically comparing the vulnerabilities of ChatGPT and StackOverflow. To achieve this, we used an existing Java dataset from SO with security-related questions and answers. Then, we asked ChatGPT the same SO questions, gathering the generated code for comparison. After curating the dataset, we analyzed the number and…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
