Detecting software vulnerabilities using Language Models
Marwan Omar

TL;DR
This paper introduces VulDetect, a transformer-based framework that fine-tunes a pre-trained large language model (GPT) to detect software vulnerabilities with up to 92.65% accuracy, outperforming SyseVR and VulDeBERT while reducing computational overhead relative to CNN and LSTM models.
Contribution
The paper presents a novel transformer-based vulnerability detection framework that leverages fine-tuning of pre-trained language models, improving accuracy and efficiency over prior deep learning approaches.
Findings
Achieves up to 92.65% detection accuracy.
Outperforms SyseVR and VulDeBERT in benchmark tests.
Reduces computational overhead compared to CNN and LSTM models.
Abstract
Recently, deep learning techniques have garnered substantial attention for their ability to identify vulnerable code patterns accurately. However, current state-of-the-art deep learning models, such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory networks (LSTMs), require substantial computational resources. The resulting overhead makes them infeasible to deploy in real-time settings. This study presents a novel transformer-based vulnerability detection framework, referred to as VulDetect, which is built by fine-tuning a pre-trained large language model (GPT) on various benchmark datasets of vulnerable code. Our empirical findings indicate that our framework is capable of identifying vulnerable software code with an accuracy of up to 92.65%. Our proposed technique outperforms SyseVR and VulDeBERT, two state-of-the-art…
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Advanced Malware Detection Techniques
