A comparative study of neural network techniques for automatic software vulnerability detection

Gaigai Tang; Lianxiao Meng; Shuangyin Ren; Weipeng Cao; Qiang Wang; Lin Yang

arXiv:2104.14978 · cs.SE · May 3, 2021

TL;DR

This study compares neural network techniques for automatic software vulnerability detection, highlighting how different models and preprocessing methods impact performance and providing practical guidelines for researchers.

Contribution

The paper systematically evaluates two neural networks, Bi-LSTM and RVFL, combined with vector representation (Word2Vec, Doc2Vec) and symbolization preprocessing for vulnerability detection, offering insights into their relative strengths.
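To make the RVFL side of the comparison concrete, here is a minimal sketch of a random vector functional link classifier, assuming NumPy and a toy synthetic dataset. The hidden size, regularization strength, sigmoid activation, and the `train_rvfl` helper are illustrative assumptions, not the paper's configuration. The point it shows is that the hidden weights stay random and fixed, so fitting reduces to a single regularized least-squares solve with no backpropagation.

```python
import numpy as np

def train_rvfl(X, y, n_hidden=64, reg=1e-2, seed=0):
    """Fit a minimal RVFL classifier and return a predict function.

    Hypothetical helper for illustration; hyperparameters are assumptions.
    """
    rng = np.random.default_rng(seed)
    # Hidden weights and biases are drawn once and never updated.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)

    def features(A):
        H = 1.0 / (1.0 + np.exp(-(A @ W + b)))  # sigmoid hidden layer
        # Direct links: raw inputs are concatenated with the hidden features.
        return np.hstack([A, H])

    D = features(X)
    # Closed-form ridge solution for the output weights:
    # beta = (D^T D + reg * I)^-1 D^T y
    beta = np.linalg.solve(D.T @ D + reg * np.eye(D.shape[1]), D.T @ y)

    def predict(A):
        # Threshold the regression output at 0.5 to get a 0/1 label.
        return (features(A) @ beta > 0.5).astype(int)

    return predict

# Toy data standing in for vectorized code samples (not real vulnerability data).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

predict = train_rvfl(X, y)
train_acc = float((predict(X) == y).mean())
```

A Bi-LSTM, by contrast, is trained iteratively with gradient descent over sequences, which is the usual explanation for the training-speed gap between the two model families.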

Findings

01. RVFL trains faster than Bi-LSTM but has lower accuracy.

02. Doc2Vec improves training speed and generalization over Word2Vec.

03. Multi-level symbolization enhances model precision.
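Symbolization replaces project-specific identifiers with generic placeholders so that a model sees code structure rather than arbitrary names. This summary does not spell out the paper's levels, so the sketch below assumes two hypothetical ones: level 1 renames user-defined functions and level 2 also renames variables. The `symbolize` function, the keyword and library-call sets, and the `FUN`/`VAR` naming scheme are all illustrative assumptions.

```python
import re

# Small illustrative sets; a real pipeline would use a proper C lexer.
C_KEYWORDS = {"int", "char", "if", "else", "for", "while", "return", "void", "sizeof"}
LIBRARY_CALLS = {"strcpy", "memcpy", "malloc", "free", "printf"}  # kept verbatim

IDENT = re.compile(r"\b[A-Za-z_]\w*")

def symbolize(code, level):
    """Assumed levels: 1 renames user-defined functions, 2 also renames variables."""
    func_map, var_map = {}, {}

    def rename(match):
        name = match.group(0)
        if name in C_KEYWORDS or name in LIBRARY_CALLS:
            return name
        # An identifier directly followed by "(" is treated as a function.
        is_call = code[match.end():].lstrip().startswith("(")
        if is_call:
            return func_map.setdefault(name, f"FUN{len(func_map) + 1}")
        if level >= 2:
            return var_map.setdefault(name, f"VAR{len(var_map) + 1}")
        return name

    return IDENT.sub(rename, code)

snippet = "char buf[10]; strcpy(buf, get_input(src));"
level1 = symbolize(snippet, 1)  # "char buf[10]; strcpy(buf, FUN1(src));"
level2 = symbolize(snippet, 2)  # "char VAR1[10]; strcpy(VAR1, FUN1(VAR2));"
```

Library calls such as `strcpy` are left untouched here because they often carry the vulnerability signal. This regex-based sketch does not handle string literals or comments.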

Abstract

Software vulnerabilities are usually caused by design flaws or implementation errors, which can be exploited to compromise the security of a system. At present, the most commonly used method for detecting software vulnerabilities is static analysis. Most of the related techniques work at the source code level, using rules or code similarity, and rely on manually defined vulnerability features. However, these rules and features are difficult to define and design accurately, which creates many challenges for static analysis in practical applications. To alleviate this problem, some researchers have proposed using neural networks, which can extract features automatically, to make detection more intelligent. However, there are many types of neural networks, and different data preprocessing methods will have a significant impact on model…

Taxonomy

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Bidirectional LSTM