Occasionally Secure: A Comparative Analysis of Code Generation Assistants
Ran Elgedawy, Porter Dosch, John Sadik, Senjuti Dutta, Anuj Gautam, Konstantinos Georgiou, Farzin Gholamrezae, Fujiao Ji, Kyungchan Lim, Qian Liu, and Scott Ruoti

TL;DR
This study compares four advanced LLMs on code generation tasks, focusing on security, functionality, and reliability, to understand how they can be deployed effectively in real-world developer scenarios.
Contribution
It provides a comparative analysis of multiple LLMs' code generation capabilities, emphasizing security and practical use cases for developers.
Findings
GPT-4 and Gemini outperform the other models in code correctness.
Security awareness varies significantly across models.
Model performance depends on task complexity and context.
Abstract
Large Language Models (LLMs) are increasingly used in a variety of applications, with code generation being a notable example. While previous research has shown that LLMs can generate both secure and insecure code, the literature does not account for the factors that help them generate secure and effective code. Therefore, in this paper, we focus on identifying and understanding the conditions and contexts in which LLMs can be effectively and safely deployed in real-world scenarios to generate quality code. We conducted a comparative analysis of four advanced LLMs (GPT-3.5 and GPT-4 via ChatGPT, and Bard and Gemini from Google) using 9 separate tasks to assess each model's code generation capabilities. We contextualized our study to represent the typical use cases of a real-life developer employing LLMs for everyday tasks at work. Additionally, we place an…
Taxonomy
Topics: Teaching and Learning Programming
Methods: Attention Is All You Need · Linear Layer · Residual Connection · Layer Normalization · Multi-Head Attention · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Position-Wise Feed-Forward Layer · Label Smoothing
