An Empirical Study of Vulnerability Detection using Federated Learning
Peiheng Zhou, Ming Hu, Xingrun Quan, Yawen Peng, Xiaofei Xie, Yanxin Yang, Chengwei Liu, Yueming Wu, Mingsong Chen

TL;DR
This paper evaluates federated learning's effectiveness in vulnerability detection, demonstrating its potential to improve detection performance across various CWEs while highlighting challenges posed by data heterogeneity.
Contribution
It introduces VulFL, a comprehensive evaluation framework for FL-based vulnerability detection, and provides an extensive study on FL's capabilities and configuration strategies for this task.
Findings
FL significantly improves vulnerability detection performance over independent training.
Data heterogeneity limits FL effectiveness in vulnerability detection.
Configuration strategies impact FL performance in vulnerability detection.
Abstract
Although Deep Learning (DL) methods are becoming increasingly popular in vulnerability detection, their performance is seriously limited by insufficient training data. This is mainly because few software organizations can maintain a complete set of high-quality samples for DL-based vulnerability detection. Because of concerns about privacy leakage, most of them are reluctant to share data, resulting in the data silo problem. Since it enables collaborative model training without data sharing, Federated Learning (FL) has been investigated as a promising means of addressing the data silo problem in DL-based vulnerability detection. However, since existing FL-based vulnerability detection methods focus on specific applications, it is still far from clear i) how well FL adapts to common vulnerability detection tasks and ii) how to design a high-performance FL solution for a specific…
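To make "collaborative model training without data sharing" concrete, the following is a minimal sketch of federated averaging (FedAvg), a common FL aggregation scheme. It is an illustration only: the abstract does not state which aggregation algorithm VulFL uses, and the logistic-regression model, the toy feature vectors, and every function name here (`local_update`, `fed_avg`, `run_rounds`) are assumptions made for the example. Each client trains on its own samples and sends back only model weights, which the server averages in proportion to client dataset size.

```python
import math
from typing import List, Tuple

# Hypothetical sample type: (feature vector, label), 1 = vulnerable, 0 = benign.
Sample = Tuple[List[float], int]


def predict(weights: List[float], features: List[float]) -> float:
    """Logistic-regression probability that a sample is vulnerable."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def local_update(global_weights: List[float], data: List[Sample],
                 lr: float = 0.5, epochs: int = 5) -> List[float]:
    """Client-side step: start from the global model, run SGD on private data."""
    weights = list(global_weights)
    for _ in range(epochs):
        for features, label in data:
            error = predict(weights, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights  # only weights leave the client, never the raw samples


def fed_avg(client_weights: List[List[float]],
            client_sizes: List[int]) -> List[float]:
    """Server-side step: average client models, weighted by dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: List[List[Sample]], dim: int,
               rounds: int = 10) -> List[float]:
    """Repeat: broadcast global model, train locally, aggregate."""
    global_weights = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = fed_avg(updates, [len(d) for d in clients])
    return global_weights


# Toy data: features are [bias, score]; two organizations hold disjoint samples.
org_a = [([1.0, 2.0], 1), ([1.0, 1.5], 1), ([1.0, -1.0], 0)]
org_b = [([1.0, -2.0], 0), ([1.0, 1.0], 1)]
model = run_rounds([org_a, org_b], dim=2)
```

In this sketch the server never sees `org_a` or `org_b`, only the weight vectors returned by `local_update`. When clients hold very different label distributions, the locally trained weights can diverge before averaging, which is one way the data heterogeneity named in the findings can reduce the quality of the aggregated model.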
Taxonomy
Topics: Network Security and Intrusion Detection
Methods: Sparse Evolutionary Training · Focus
