Counteracting Concept Drift by Learning with Future Malware Predictions
Branislav Bosansky, Lada Hospodkova, Michal Najman, Maria Rigaki, Elnaz Babayeva, Viliam Lisy

TL;DR
This paper explores using GANs to generate future malware samples to counteract concept drift, showing that GAN-based predictions improve classifier accuracy on unseen data, unlike adversarial training.
Contribution
It introduces a novel approach: using GANs to predict future malware samples and thereby mitigate concept drift in malware detection.
Findings
GANs effectively predict future malware samples.
GAN-based predictions improve classifier accuracy on new data.
Adversarial training yields more robust classifiers but less accurate future sample prediction.
Abstract
The accuracy of deployed malware-detection classifiers degrades over time due to changes in data distributions and increasing discrepancies between training and testing data. This phenomenon is known as the concept drift. While the concept drift can be caused by various reasons in general, new malicious files are created by malware authors with a clear intention of avoiding detection. The existence of the intention opens a possibility for predicting such future samples. Including predicted samples in training data should consequently increase the accuracy of the classifiers on new testing data. We compare two methods for predicting future samples: (1) adversarial training and (2) generative adversarial networks (GANs). The first method explicitly seeks for adversarial examples against the classifier that are then used as a part of training data. Similarly, GANs also generate synthetic…
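The first method in the abstract, adversarial training, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the two-dimensional synthetic features, the logistic-regression classifier, the sign-based perturbation in the hypothetical helper `evade`, and the step size `eps`. It shows only the general idea of finding samples that lower the classifier's malicious score and adding them, still labeled malicious, to the training data. The GAN-based method is not covered.

```python
import math
import random

def predict(w, b, x):
    """Probability that feature vector x is malicious under a logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=200, lr=0.1):
    """Fit logistic regression with plain stochastic gradient descent."""
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, b, x) - y  # gradient of the log-loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def evade(w, x, eps):
    """Hypothetical attacker step: move each feature by eps in the
    direction that lowers the malicious score of a linear model."""
    return [xi - eps * math.copysign(1.0, wi) if wi != 0 else xi
            for xi, wi in zip(x, w)]

# Synthetic stand-in data (assumption): benign near (0, 0), malicious near (2, 2).
random.seed(0)
benign = [([random.gauss(0, 0.5), random.gauss(0, 0.5)], 0) for _ in range(50)]
malicious = [([random.gauss(2, 0.5), random.gauss(2, 0.5)], 1) for _ in range(50)]
data = benign + malicious

# 1. Train on the currently known samples.
w, b = train(data)

# 2. Predict "future" malware as evasive variants of the known malicious samples.
predicted = [(evade(w, x, 0.5), 1) for x, _ in malicious]

# 3. Retrain with the predicted samples included, still labeled malicious.
w2, b2 = train(data + predicted)
```

For a linear model, the step in `evade` can never raise a sample's malicious score, which is why it serves as a simple stand-in for an adversarial example search. Whether retraining on such samples helps on genuinely new malware is the empirical question the paper studies.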
Taxonomy
Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications
