Enabling Privacy-Preserving Cyber Threat Detection with Federated Learning
Yu Bi, Yekai Li, Xuan Feng, Xianghang Mi

TL;DR
This paper investigates the feasibility of federated learning for privacy-preserving cyber threat detection, demonstrating comparable performance to centralized models and resilience against attacks, while addressing efficiency challenges.
Contribution
It systematically evaluates federated learning for threat detection, analyzing effectiveness, robustness, and efficiency under realistic privacy and attack scenarios.
Findings
Federated learning (FL) achieves performance similar to centralized models.
Non-IID data has minor impact on FL effectiveness.
FL is resilient to data and model poisoning attacks.
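To make the aggregation step behind these findings concrete, here is a minimal sketch of one server-side round. It is an illustration under assumptions, not the paper's implementation: this summary does not state which aggregation rule or defense the authors use. The function names (`fed_avg`, `coordinate_median`) and the toy client updates are hypothetical. The sketch contrasts a size-weighted average in the style of FedAvg with a coordinate-wise median, one commonly used Byzantine-robust alternative.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors (FedAvg-style)."""
    stacked = np.stack(client_weights)              # shape: (n_clients, n_params)
    sizes = np.asarray(client_sizes, dtype=float)   # local sample counts
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()

def coordinate_median(client_weights):
    """Coordinate-wise median across clients (a robust aggregation rule)."""
    return np.median(np.stack(client_weights), axis=0)

# Hypothetical round: three honest clients plus one poisoned update.
honest = [np.array([1.0, 2.0]), np.array([1.2, 1.8]), np.array([0.8, 2.2])]
poisoned = np.array([100.0, -100.0])
updates = honest + [poisoned]

avg = fed_avg(updates, client_sizes=[10, 10, 10, 10])
med = coordinate_median(updates)

print("weighted average:", avg)   # pulled far from the honest updates
print("coordinate median:", med)  # stays close to the honest updates
```

In this toy round the single poisoned update drags the plain average far from the honest clients, while the median stays near them. Raw data never leaves the clients in either case; only parameter vectors are shared with the server.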
Abstract
Despite achieving good performance and wide adoption, machine-learning-based security detection models (e.g., malware classifiers) are subject to concept drift and the evasive evolution of attackers, which makes up-to-date threat data a necessity. However, due to the enforcement of various privacy protection regulations (e.g., GDPR), it is becoming increasingly challenging or even prohibitive for security vendors to collect individual-relevant and privacy-sensitive threat datasets, e.g., SMS spam/non-spam messages from mobile devices. To address such obstacles, this study systematically profiles the (in)feasibility of federated learning for privacy-preserving cyber threat detection in terms of effectiveness, Byzantine resilience, and efficiency. This is made possible by the construction of multiple threat datasets and threat detection models, and more importantly, the design of realistic and…
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning
