Ocassionally Secure: A Comparative Analysis of Code Generation Assistants

Ran Elgedawy; Porter Dosch; John Sadik; Senjuti Dutta; Anuj Gautam; Konstantinos Georgiou; Farzin Gholamrezae; Fujiao Ji; Kyungchan Lim; Qian Liu; and Scott Ruoti

arXiv:2402.00689 · cs.CR · September 30, 2025 · 2 cites

Open Access

TL;DR

This study compares four advanced LLMs in code generation tasks, focusing on security, functionality, and reliability to understand their effective deployment in real-world developer scenarios.

Contribution

It provides a comparative analysis of multiple LLMs' code generation capabilities, emphasizing security and practical use cases for developers.

Findings

1. GPT-4 and Gemini outperform others in code correctness.
2. Security awareness varies significantly across models.
3. Model performance depends on task complexity and context.

Abstract

Large Language Models (LLMs) are being increasingly utilized in various applications, with code generation being a notable example. While previous research has shown that LLMs can generate both secure and insecure code, the literature does not take into account which factors help generate secure and effective code. Therefore, in this paper we focus on identifying and understanding the conditions and contexts in which LLMs can be effectively and safely deployed in real-world scenarios to generate quality code. We conducted a comparative analysis of four advanced LLMs (GPT-3.5 and GPT-4 via ChatGPT, and Bard and Gemini from Google) using 9 separate tasks to assess each model's code generation capabilities. We contextualized our study to represent the typical use cases of a real-life developer employing LLMs for everyday tasks at work. Additionally, we place an…

Taxonomy

Topics: Teaching and Learning Programming

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Layer Normalization · Multi-Head Attention · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Position-Wise Feed-Forward Layer · Label Smoothing