LLMSecCode: Evaluating Large Language Models for Secure Coding

Anton Rydén; Erik Näslund; Elad Michael Schiller; Magnus Almgren

arXiv:2408.16100·cs.CR·August 30, 2024

TL;DR

This paper introduces LLMSecCode, an open-source framework for objectively evaluating the cybersecurity capabilities of large language models in secure coding, aiming to standardize benchmarking and improve selection processes.

Contribution

The paper presents a novel open-source evaluation framework for assessing LLMs in secure coding, addressing key research questions and validating its effectiveness through experiments.

Findings

01

Performance differs by 10% when parameters are varied and by 9% when prompts are varied.

02

A comparison of some results with those of reliable external actors shows a 5% difference.

03

The framework facilitates standardized benchmarking of LLMs on secure coding tasks.
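To make the reported differences concrete, the following minimal sketch shows one way a spread across configurations could be computed. It is an illustration only, not LLMSecCode's actual code: the function name `spread`, the configuration labels, and the scores are all hypothetical placeholders, and the sketch assumes the difference is measured in percentage points between the best and worst configuration, which the summary does not specify.

```python
def spread(scores):
    """Return the gap between the best and worst score, in percentage points.

    `scores` maps a configuration label to a score in percent.
    Assumption: the difference is absolute (max minus min), not relative.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    values = list(scores.values())
    return max(values) - min(values)


# Placeholder values for illustration only; these are NOT results from the paper.
by_prompt = {"prompt_a": 40, "prompt_b": 46, "prompt_c": 43}
by_parameters = {"setting_a": 38, "setting_b": 45}

print("spread across prompts:", spread(by_prompt))
print("spread across parameter settings:", spread(by_parameters))
```

Reporting the spread separately for prompts and for parameters keeps the two sources of variation apart, which mirrors how the findings state them.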

Abstract

The rapid deployment of Large Language Models (LLMs) requires careful consideration of their effect on cybersecurity. Our work aims to improve the selection process of LLMs that are suitable for facilitating Secure Coding (SC). This raises challenging research questions, such as (RQ1) Which functionality can streamline the LLM evaluation? (RQ2) What should the evaluation measure? (RQ3) How to attest that the evaluation process is impartial? To address these questions, we introduce LLMSecCode, an open-source evaluation framework designed to assess LLM SC capabilities objectively. We validate the LLMSecCode implementation through experiments. When varying parameters and prompts, we find a 10% and 9% difference in performance, respectively. We also compare some results to reliable external actors, where our results show a 5% difference. We strive to ensure the ease of use of our…

Code & Models

Repositories

anton-ryden/LLMSecCode (official, PyTorch)
