Symbolic analysis meets federated learning to enhance malware identifier
Khanh Huu The Dam, Charles-Henry Bertrand Van Ouytsel, Axel Legay

TL;DR
This paper introduces a federated learning approach utilizing symbolic behavioral graphs and deep learning to improve malware detection accuracy while preserving data privacy across distributed sources.
Contribution
It presents a novel federated learning system that uses behavioral graphs and deep learning for malware identification, addressing data privacy and collection challenges.
Findings
Achieved 85% accuracy on homogeneous graph data.
Achieved 93% accuracy on inhomogeneous graph data.
Demonstrated effective privacy-preserving malware detection.
Abstract
Over the past years, manual methods for creating detection rules have become impractical in anti-malware products as the number of malware threats has grown. Turning to machine learning approaches is therefore a promising way to make malware recognition more efficient. Traditional centralized machine learning requires a large amount of data to train a model with excellent performance. To boost malware detection, the training data might come from various kinds of data sources, such as host, network, and cloud-based anti-malware components, or even from different enterprises. To avoid the expense of data collection as well as the leakage of private data, we present a federated learning system to identify malware through behavioural graphs, i.e., system call dependency graphs. It is based on a deep learning model including a graph autoencoder and a…
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Complex Network Analysis Techniques
