Comparing Unidirectional, Bidirectional, and Word2vec Models for Discovering Vulnerabilities in Compiled Lifted Code

Gary A. McCully; John D. Hastings; Shengjie Xu; Adam Fortier

arXiv:2409.17513 · cs.CR · June 5, 2025

Open Access

TL;DR

This paper compares embeddings from a unidirectional GPT-2 model with embeddings from the bidirectional BERT and RoBERTa models for detecting vulnerabilities in compiled code lifted to LLVM. LSTM classifiers built on the GPT-2 embeddings perform best, reaching an accuracy of 92.5% and an F1-score of 89.7%.

Contribution

The paper introduces unidirectional GPT-2 embeddings for vulnerability detection in compiled code and compares their effectiveness against embeddings from bidirectional models such as BERT and RoBERTa.

Findings

01 GPT-2 embeddings outperform bidirectional models in accuracy and F1-score

02 Unfrozen embedding layers yield the best neural network performance

03 SGD optimizer performs better than Adam in this context

Abstract

Ransomware and other forms of malware cause significant financial and operational damage to organizations by exploiting long-standing and often difficult-to-detect software vulnerabilities. To detect vulnerabilities such as buffer overflows in compiled code, this research investigates the application of unidirectional transformer-based embeddings, specifically GPT-2. Using a dataset of LLVM functions, we trained a GPT-2 model to generate embeddings, which were subsequently used to build LSTM neural networks to differentiate between vulnerable and non-vulnerable code. Our study reveals that embeddings from the GPT-2 model significantly outperform those from bidirectional models of BERT and RoBERTa, achieving an accuracy of 92.5% and an F1-score of 89.7%. LSTM neural networks were developed with both frozen and unfrozen embedding model layers. The model with the highest performance was…
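The abstract reports both accuracy and F1-score for the vulnerable versus non-vulnerable classification. As a reminder of how the two metrics relate, here is a minimal sketch in plain Python. The function name `accuracy_and_f1` and the toy labels are illustrative assumptions, not taken from the paper, and the numbers below are unrelated to the reported 92.5% and 89.7%.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, where 1 = vulnerable.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, f1


# Toy labels, made up for this example only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}  f1={f1:.3f}")
```

Because F1 does not count true negatives, it can sit below accuracy when most samples belong to the negative class, which is one reason the two figures are usually reported together.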

Taxonomy

Topics: Software Reliability and Analysis Research · Web Application Security Vulnerabilities · Software Engineering Research

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Dense Connections · Multi-Head Attention · Linear Warmup With Linear Decay · Weight Decay · Linear Warmup With Cosine Annealing · WordPiece · Residual Connection