Learning to map source code to software vulnerability using code-as-a-graph
Sahil Suneja, Yunhui Zheng, Yufan Zhuang, Jim Laredo, Alessandro Morari

TL;DR
This paper shows that a Gated Graph Neural Network trained on Code Property Graphs can learn to identify software vulnerabilities, outperforming static analyzers, classic machine learning, and CNN- and RNN-based deep learning models on two of the three datasets tested.
Contribution
The paper introduces AI4VA, a pipeline that encodes source code as a Code Property Graph, vectorizes the graph in a way that preserves its semantic information, and trains a Gated Graph Neural Network to distinguish vulnerable samples from healthy ones.
Findings
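The Gated Graph Neural Network outperforms static analyzers, classic machine learning, and CNN- and RNN-based deep learning models.
A code-as-graph encoding lets the model learn vulnerability signatures from the relationships between nodes and edges.
The advantage holds on two of the three datasets the authors experiment with, not on all three.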
Abstract
We explore the applicability of Graph Neural Networks in learning the nuances of source code from a security perspective. Specifically, whether signatures of vulnerabilities in source code can be learned from its graph representation, in terms of relationships between nodes and edges. We create a pipeline we call AI4VA, which first encodes a sample source code into a Code Property Graph. The extracted graph is then vectorized in a manner which preserves its semantic information. A Gated Graph Neural Network is then trained using several such graphs to automatically extract templates differentiating the graph of a vulnerable sample from a healthy one. Our model outperforms static analyzers, classic machine learning, as well as CNN and RNN-based deep learning models on two of the three datasets we experiment with. We thus show that a code-as-graph encoding is more meaningful for…
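The pipeline described in the abstract (graph encoding, vectorization, gated message passing, graph-level classification) can be sketched roughly as follows. This is a minimal illustration, not the AI4VA implementation: the toy graph, the node and edge type names, the one-hot vectorization, the sum pooling, and the random untrained weights are all assumptions made for the example, and bias terms and reverse edges are omitted. The update rule is the standard GRU-style gated propagation step used in Gated Graph Neural Networks.

```python
import math
import random

# Hypothetical toy graph for something like `int x = read(); buf[x] = 0;`.
# Node labels and edge types are illustrative, not the paper's actual schema.
NODE_TYPES = ["Decl", "Call", "Assign", "ArrayIndex"]
EDGE_TYPES = ["ast", "cfg", "data"]
nodes = ["Decl", "Call", "Assign", "ArrayIndex"]
edges = [
    (0, 1, "ast"),   # the declaration contains the call
    (0, 2, "cfg"),   # control flows from the declaration to the assignment
    (2, 3, "ast"),   # the assignment contains the array index
    (0, 3, "data"),  # the declared variable is used as the index
]


def one_hot(label, vocab):
    """Vectorize a node label as a one-hot list over the vocabulary."""
    return [1.0 if label == v else 0.0 for v in vocab]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def ggnn_step(h, edge_list, params):
    """One gated propagation step: aggregate typed messages, then GRU update."""
    dim = len(h[0])
    msgs = [[0.0] * dim for _ in h]
    for src, dst, etype in edge_list:
        m = matvec(params["W"][etype], h[src])  # edge-type-specific transform
        msgs[dst] = [a + b for a, b in zip(msgs[dst], m)]
    new_h = []
    for hv, mv in zip(h, msgs):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(params["Wz"], mv), matvec(params["Uz"], hv))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(params["Wr"], mv), matvec(params["Ur"], hv))]
        gated = [ri * hi for ri, hi in zip(r, hv)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(params["Wh"], mv), matvec(params["Uh"], gated))]
        new_h.append([(1 - zi) * hi + zi * ci
                      for zi, hi, ci in zip(z, hv, cand)])
    return new_h


def classify(h, w_out):
    """Sum-pool node states and map them to a score in (0, 1)."""
    pooled = [sum(col) for col in zip(*h)]
    return sigmoid(sum(w * x for w, x in zip(w_out, pooled)))


def run_demo(seed=0, steps=2):
    rng = random.Random(seed)
    dim = len(NODE_TYPES)
    params = {name: rand_matrix(dim, dim, rng)
              for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
    params["W"] = {t: rand_matrix(dim, dim, rng) for t in EDGE_TYPES}
    h = [one_hot(label, NODE_TYPES) for label in nodes]
    for _ in range(steps):
        h = ggnn_step(h, edges, params)
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    return h, classify(h, w_out)


if __name__ == "__main__":
    states, score = run_demo()
    print(f"untrained vulnerability score: {score:.3f}")
```

Because the weights are random, the score carries no meaning; in a trained model the parameters would be fitted on many labeled graphs so that the pooled node states separate vulnerable samples from healthy ones.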
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Advanced Malware Detection Techniques
Methods: Graph Neural Network
