Learning to Quantize Vulnerability Patterns and Match to Locate Statement-Level Vulnerabilities

Michael Fu; Trung Le; Van Nguyen; Chakkrit Tantithamthavorn; Dinh Phung

arXiv:2306.06109·cs.CR·June 13, 2023·2 cites

TL;DR

This paper introduces a novel vulnerability-matching approach that learns a codebook of quantized vulnerability patterns, significantly improving the accuracy of locating statement-level vulnerabilities in software programs using deep learning.

Contribution

The paper proposes a new vulnerability pattern learning and matching method that leverages a codebook of quantized vectors, enhancing vulnerability detection accuracy over previous approaches.
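The idea of matching against a codebook of quantized vectors can be illustrated with a nearest-neighbour lookup. The sketch below is generic vector quantization, not the paper's implementation: the names `squared_distance` and `quantize`, the toy two-dimensional codebook, and the choice of squared Euclidean distance are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(embedding: Vector, codebook: List[Vector]) -> Tuple[int, Vector]:
    """Return the index and value of the codebook entry closest to `embedding`.

    Hypothetical helper: each codebook entry stands in for one learned
    pattern, and an embedding is "matched" to its nearest entry.
    """
    if not codebook:
        raise ValueError("codebook must not be empty")
    best_index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(embedding, codebook[i]),
    )
    return best_index, codebook[best_index]


# Toy data (assumed): three codebook entries and three statement embeddings.
toy_codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
statement_embeddings = [[0.9, 1.2], [3.5, 0.1], [0.1, -0.2]]

# Index of the nearest codebook entry for each statement embedding.
matched = [quantize(e, toy_codebook)[0] for e in statement_embeddings]
print(matched)
```

In a learned setting the codebook entries would be trained jointly with the encoder rather than fixed by hand; the lookup step itself stays the same.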

Findings

01 Achieved a 94% F1-score for function-level vulnerability detection.

02 Achieved an 82% F1-score for statement-level vulnerability detection.

03 Outperformed previous methods by 6% on function-level and 19% on statement-level detection, respectively.

Abstract

Deep learning (DL) models have become increasingly popular in identifying software vulnerabilities. Prior studies found that vulnerabilities across different vulnerable programs may exhibit similar vulnerable scopes, implicitly forming discernible vulnerability patterns that can be learned by DL models through supervised training. However, vulnerable scopes still manifest in various spatial locations and formats within a program, posing challenges for models to accurately identify vulnerable statements. Despite this challenge, state-of-the-art vulnerability detection approaches fail to exploit the vulnerability patterns that arise in vulnerable programs. To take full advantage of vulnerability patterns and unleash the ability of DL models, we propose a novel vulnerability-matching approach in this paper, drawing inspiration from program analysis tools that locate vulnerabilities based…

Code & Models

Repositories

optimatch/optimatch
PyTorch · Official

Taxonomy

Topics: Software Engineering Research · Software Reliability and Analysis Research · Advanced Malware Detection Techniques
