How the Training Procedure Impacts the Performance of Deep Learning-based Vulnerability Patching

Antonio Mastropaolo; Vittoria Nardone; Gabriele Bavota; Massimiliano Di Penta

arXiv:2404.17896·cs.SE·April 30, 2024

TL;DR

This study systematically compares training procedures for deep learning models in vulnerability patching, revealing supervised pre-training's superiority and the effectiveness of prompt-tuning for self-supervised models.

Contribution

It provides the first comprehensive comparison of pre-training methods and explores prompt-tuning's impact on vulnerability patching performance.

Findings

01

Supervised pre-training on bug-fixing data significantly improves patching performance.

02

Prompt-tuning enhances self-supervised models without additional data collection.

03

Prompt-tuning yields no significant performance gain on models that were already pre-trained in a supervised fashion.
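To make the idea of prompt-tuning more concrete, the following is a minimal sketch of one common variant (a hard, natural-language prompt wrapped around each training instance). The template wording, the function name `build_prompted_example`, and the C snippets are illustrative assumptions, not material taken from the paper or its repository.

```python
# Hypothetical hard-prompt template: the wording is an assumption made for
# illustration, not a template reported by the authors.
HARD_PROMPT = "Generate a patch for this vulnerable code: {vulnerable_code}"


def build_prompted_example(vulnerable_code: str,
                           fixed_code: str,
                           template: str = HARD_PROMPT) -> dict:
    """Wrap a (vulnerable, fixed) pair into a prompted input/target pair.

    The input is the vulnerable code embedded in a natural-language
    template; the target is the patched code the model should generate.
    """
    return {
        "input": template.format(vulnerable_code=vulnerable_code.strip()),
        "target": fixed_code.strip(),
    }


# Toy instance (hypothetical code, not drawn from any dataset).
example = build_prompted_example(
    "strcpy(buf, src);",
    "strncpy(buf, src, sizeof(buf) - 1);",
)
print(example["input"])
print(example["target"])
```

Because only the input text changes, a scheme like this needs no additional training data, which is consistent with the second finding above.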

Abstract

Generative deep learning (DL) models have been successfully adopted for vulnerability patching. However, such models require a large dataset of patches to learn from. To overcome this issue, researchers have proposed starting from models pre-trained with general knowledge, either on the programming language or on similar tasks such as bug fixing. Despite the efforts in the area of automated vulnerability patching, there is a lack of systematic studies on how these different training procedures impact the performance of DL models for such a task. This paper makes a manifold contribution to bridge this gap by (i) comparing existing solutions of self-supervised and supervised pre-training for vulnerability patching; and (ii) for the first time, experimenting with different kinds of prompt-tuning for this task. The study required training and testing 23 DL models. We found…

Code & Models

Repositories

antonio-mastropaolo/dl-training-vuln-patching
PyTorch · Official

Taxonomy

Topics: Advanced Malware Detection Techniques · Digital and Cyber Forensics · Network Security and Intrusion Detection