Exploring the Limits of ChatGPT in Software Security Applications

Fangzhou Wu; Qingzhao Zhang; Ati Priya Bajaj; Tiffany Bao; Ning Zhang; Ruoyu "Fish" Wang; Chaowei Xiao

arXiv:2312.05275 · cs.CR · December 12, 2023 · 2 cites

Open Access

TL;DR

This paper investigates the capabilities and limitations of ChatGPT, especially GPT-4, across various software security tasks, revealing its strengths in code understanding and generation, as well as areas needing improvement.

Contribution

The study systematically evaluates ChatGPT's performance in seven security applications, highlighting its strengths and limitations in the security domain.

Findings

01 GPT-4 outperforms GPT-3.5 in most security tasks.

02 ChatGPT demonstrates strong understanding of control/data flows and code generation.

03 Limitations include difficulty processing long code contexts.

Abstract

Large language models (LLMs) have undergone rapid evolution and achieved remarkable results in recent times. OpenAI's ChatGPT, backed by GPT-3.5 or GPT-4, has gained instant popularity due to its strong capability across a wide range of tasks, including natural language tasks, coding, mathematics, and engaging conversations. However, the impacts and limits of such LLMs in the system security domain are less explored. In this paper, we delve into the limits of LLMs (i.e., ChatGPT) in seven software security applications including vulnerability detection/repair, debugging, debloating, decompilation, patching, root cause analysis, symbolic execution, and fuzzing. Our exploration reveals that ChatGPT not only excels at generating code, which is the conventional application of language models, but also demonstrates strong capability in understanding user-provided commands in natural languages,…

Taxonomy

Topics: Software Engineering Research · Software Reliability and Analysis Research

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Adam · Weight Decay