Llama-based source code vulnerability detection: Prompt engineering vs Fine tuning
Dyna Soumhane Ouchebara, Stéphane Dupont

TL;DR
This paper evaluates Llama-3.1 8B for source code vulnerability detection, comparing prompt engineering and fine-tuning techniques, introducing Double Fine-tuning, and highlighting the importance of fine-tuning for improved performance.
Contribution
It introduces a novel Double Fine-tuning approach and assesses various fine-tuning and prompt strategies for LLM-based vulnerability detection.
Findings
Fine-tuning significantly improves detection performance.
Double Fine-tuning outperforms other methods.
Prompt engineering alone is ineffective for this task.
Abstract
The significant increase in software production, driven by the acceleration of development cycles over the past two decades, has led to a steady rise in software vulnerabilities, as shown by statistics published yearly by the CVE program. The automation of the source code vulnerability detection (CVD) process has thus become essential, and several methods have been proposed, ranging from well-established program analysis techniques to more recent AI-based methods. Our research investigates Large Language Models (LLMs), which are considered among the most performant AI models to date, for the CVD task. The objective is to study their performance and apply different state-of-the-art techniques to enhance their effectiveness for this task. We explore various fine-tuning and prompt engineering settings. We particularly suggest one novel approach for fine-tuning LLMs which we call…
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Web Application Security Vulnerabilities
