Vulnerability Detection in C/C++ Code with Deep Learning
Zhen Huang, Amy Aumpansub

TL;DR
This paper presents a deep learning approach that trains neural networks on program slices extracted from C/C++ source code to detect software vulnerabilities, reaching over 92% accuracy.
Contribution
The paper introduces a method that combines syntactic and semantic features of code slices, and it compares neural network types and training strategies for vulnerability detection.
Findings
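A BGRU (bidirectional gated recurrent unit) network trained with the Adam optimizer achieves 92.49% accuracy.
Combining multiple code characteristics yields more balanced accuracy across vulnerable and non-vulnerable code.
Training on a balanced number of vulnerable and non-vulnerable program slices improves vulnerability detection performance.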
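The balanced-data finding can be illustrated with a minimal sketch. The summary does not say how the authors balanced their training set, so random undersampling of the majority class is an assumption here, and the function name `balance_slices` and the `(slice_text, label)` sample format are hypothetical.

```python
import random


def balance_slices(samples, seed=0):
    """Undersample the larger class so both labels appear equally often.

    `samples` is a list of (slice_text, label) pairs, where label 1 means
    vulnerable and label 0 means non-vulnerable. This format is an
    assumption made for illustration, not the paper's data layout.
    """
    rng = random.Random(seed)
    vulnerable = [s for s in samples if s[1] == 1]
    non_vulnerable = [s for s in samples if s[1] == 0]

    # Keep as many samples per class as the smaller class provides.
    n = min(len(vulnerable), len(non_vulnerable))
    balanced = rng.sample(vulnerable, n) + rng.sample(non_vulnerable, n)

    # Shuffle so the two classes are interleaved before training.
    rng.shuffle(balanced)
    return balanced


# Toy data: 3 vulnerable slices and 10 non-vulnerable slices.
toy = [("strcpy(buf, src);", 1)] * 3 + [("strncpy(buf, src, n);", 0)] * 10
balanced_toy = balance_slices(toy)
```

Undersampling discards data from the larger class. Oversampling or class weighting are common alternatives when the smaller class is very small.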
Abstract
Deep learning has been shown to be a promising tool for detecting software vulnerabilities. In this work, we train neural networks with program slices extracted from the source code of C/C++ programs to detect software vulnerabilities. The program slices capture the syntax and semantic characteristics of vulnerability-related program constructs, including API function calls, array usage, pointer usage, and arithmetic expressions. To achieve a strong prediction model for both vulnerable code and non-vulnerable code, we compare different types of training data, different optimizers, and different types of neural networks. Our results show that combining different types of characteristics of source code and using a balanced number of vulnerable program slices and non-vulnerable program slices produce a balanced accuracy in predicting both vulnerable code and non-vulnerable code. Among…
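One way to check whether a model is accurate on both vulnerable and non-vulnerable code is to report accuracy per class alongside overall accuracy. The sketch below is an illustration only: the function name `per_class_accuracy` and the 0/1 label encoding are assumptions, and the paper may report its metrics differently.

```python
def per_class_accuracy(y_true, y_pred):
    """Return accuracy on each class and overall accuracy.

    Labels are assumed to be 1 for vulnerable and 0 for non-vulnerable.
    A class with no samples gets None instead of a ratio.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # vulnerable, caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, passed
    pos = sum(1 for t, _ in pairs if t == 1)
    neg = sum(1 for t, _ in pairs if t == 0)

    return {
        "vulnerable": tp / pos if pos else None,
        "non_vulnerable": tn / neg if neg else None,
        "overall": (tp + tn) / len(pairs) if pairs else None,
    }


# Toy labels: 2 vulnerable and 4 non-vulnerable slices.
scores = per_class_accuracy([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

A large gap between the two per-class values signals a model biased toward one class, even when overall accuracy looks high.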
Taxonomy
Topics: Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques
Methods: Adam
