Variables are a Curse in Software Vulnerability Prediction

Jinghua Groppe; Sven Groppe; Ralf Möller

arXiv:2407.02509·cs.SE·July 4, 2024

TL;DR

This paper introduces an approach to software vulnerability prediction that removes concrete variable names from code, so that models learn code functionality rather than surface text while using far less memory.

Contribution

The paper proposes a new edge type called name dependence and a 3-property encoding scheme to abstract variable names, improving vulnerability prediction and memory efficiency.

Findings

01. Models with the new techniques outperform existing approaches in vulnerability prediction.

02. Memory usage is reduced by a factor of up to 30,000 with the proposed methods.

03. The approach helps models learn the functionality of code rather than its surface text.

Abstract

Deep learning-based approaches to software vulnerability prediction currently rely mainly on the original text of software code as the feature of nodes in the code graph. They may therefore learn a representation that is specific to the code text, rather than one that captures the 'intrinsic' functionality of a program hidden in that text. One curse that causes this problem is the infinite number of possible names for a variable. To lift the curse, this work introduces a new type of edge called name dependence, a type of abstract syntax graph based on name dependence, and an efficient node representation method named the 3-property encoding scheme. These techniques allow us to remove the concrete variable names from code and help deep learning models learn the functionality of software hidden in diverse code expressions. The…
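To make the idea of removing concrete variable names more tangible, here is a minimal Python sketch. It is an illustration only, not the authors' construction: this summary does not give the formal definition of name dependence or of the 3-property encoding scheme, so the sketch simply assumes that occurrences of the same variable can be linked by an edge and the name itself then discarded. The function name `name_dependence_edges` and the two snippets are hypothetical, and Python's standard `ast` module stands in for whatever code graph the paper uses.

```python
import ast


def name_dependence_edges(source: str):
    """Link each variable occurrence to the previous occurrence of the same name.

    Assumption for illustration: occurrences are numbered in source order and
    the concrete names are dropped, so only the linking structure remains.
    Returns (edges, number_of_occurrences).
    """
    tree = ast.parse(source)

    # Collect identifier occurrences: function parameters and Name nodes.
    # Simplification: builtins and called function names would count too.
    occurrences = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            occurrences.append((node.lineno, node.col_offset, node.id))
        elif isinstance(node, ast.arg):
            occurrences.append((node.lineno, node.col_offset, node.arg))
    occurrences.sort()  # source order: by line, then by column

    last_seen = {}  # name -> index of its most recent occurrence
    edges = []
    for index, (_, _, name) in enumerate(occurrences):
        if name in last_seen:
            edges.append((last_seen[name], index))
        last_seen[name] = index
    return edges, len(occurrences)


# Two hypothetical snippets that differ only in how variables are named.
snippet_a = "def f(buf, n):\n    total = buf[n]\n    return total + n\n"
snippet_b = "def g(data, idx):\n    x = data[idx]\n    return x + idx\n"

edges_a, count_a = name_dependence_edges(snippet_a)
edges_b, count_b = name_dependence_edges(snippet_b)

print(edges_a)             # [(0, 3), (1, 4), (2, 5), (4, 6)]
print(edges_a == edges_b)  # True
```

Both snippets produce the same edge list even though no variable name is shared between them, which is the property the abstract is after: the structure that survives is independent of how the programmer chose to name things.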
