MalCL: Leveraging GAN-Based Generative Replay to Combat Catastrophic Forgetting in Malware Classification
Jimin Park, AHyun Ji, Minji Park, Mohammad Saidur Rahman, Se Eun Oh

TL;DR
This paper introduces MalCL, a GAN-based continual learning system for malware classification. It mitigates catastrophic forgetting by generating synthetic samples of past data and selecting replay samples based on the model's hidden representations, reaching 55% accuracy on Windows malware.
Contribution
MalCL is the first to combine GANs with feature matching loss and novel sample selection schemes for malware continual learning, enhancing performance on evolving malware datasets.
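To make the feature matching idea concrete, here is a minimal sketch of the commonly used form of the loss: the squared L2 distance between the batch-mean features of real and generated samples, taken from an intermediate layer of the discriminator. The summary above does not give MalCL's exact formulation, so the function names and the choice of plain Python lists for feature vectors are assumptions made for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(rows: Sequence[Vector]) -> List[float]:
    """Column-wise mean of a batch of equally sized feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def feature_matching_loss(real_feats: Sequence[Vector],
                          fake_feats: Sequence[Vector]) -> float:
    """Squared L2 distance between batch-mean features.

    real_feats / fake_feats are assumed to be intermediate discriminator
    activations for real and generated samples (hypothetical inputs).
    """
    mu_real = mean_vector(real_feats)
    mu_fake = mean_vector(fake_feats)
    return sum((r - f) ** 2 for r, f in zip(mu_real, mu_fake))


# Toy usage: the generator is rewarded when its mean features match the real ones.
real = [[1.0, 2.0], [3.0, 4.0]]
matched = [[2.0, 3.0], [2.0, 3.0]]
collapsed = [[0.0, 0.0], [0.0, 0.0]]
loss_matched = feature_matching_loss(real, matched)
loss_collapsed = feature_matching_loss(real, collapsed)
```

In this form the generator is trained to match feature statistics rather than to fool the discriminator's final output directly, which is why the technique is often used to stabilize GAN training.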
Findings
Achieves 55% accuracy on Windows malware, 28% higher than previous methods.
Demonstrates effective mitigation of catastrophic forgetting in class-incremental malware learning.
Provides practical insights and a publicly available implementation for future research.
Abstract
Continual Learning (CL) for malware classification tackles the rapidly evolving nature of malware threats and the frequent emergence of new types. Generative Replay (GR)-based CL systems utilize a generative model to produce synthetic versions of past data, which are then combined with new data to retrain the primary model. Traditional machine learning techniques in this domain often struggle with catastrophic forgetting, where a model's performance on old data degrades over time. In this paper, we introduce a GR-based CL system that employs Generative Adversarial Networks (GANs) with feature matching loss to generate high-quality malware samples. Additionally, we implement innovative selection schemes for replay samples based on the model's hidden representations. Our comprehensive evaluation across Windows and Android malware datasets in a class-incremental learning scenario --…
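The replay step described in the abstract can be sketched as follows. This is one plausible selection rule, not necessarily the one MalCL uses: it keeps the generated samples whose hidden representation lies closest to the centroid of the real samples' hidden representations, then mixes them with the new task's data. All names (`select_replay_samples`, `build_training_set`) and the use of precomputed hidden vectors are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def centroid(rows: Sequence[Vector]) -> List[float]:
    """Mean hidden representation of a set of samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_replay_samples(synthetic: Sequence[str],
                          synthetic_hidden: Sequence[Vector],
                          target: Vector,
                          k: int) -> List[str]:
    """Keep the k synthetic samples whose hidden vector is nearest to target."""
    ranked = sorted(range(len(synthetic)),
                    key=lambda i: sq_dist(synthetic_hidden[i], target))
    return [synthetic[i] for i in ranked[:k]]


def build_training_set(new_data: Sequence[Tuple[str, int]],
                       replay: Sequence[str],
                       replay_label: int) -> List[Tuple[str, int]]:
    """Combine new-task data with labeled replay samples for retraining."""
    return list(new_data) + [(s, replay_label) for s in replay]


# Toy usage with hypothetical sample IDs and 2-D hidden vectors.
old_class_hidden = [[1.0, 1.0], [1.0, 1.0]]          # real samples of an old class
generated = ["g0", "g1", "g2"]                       # generator outputs
generated_hidden = [[0.0, 0.0], [5.0, 5.0], [1.0, 1.0]]
replay = select_replay_samples(generated, generated_hidden,
                               centroid(old_class_hidden), k=2)
train_set = build_training_set([("new0", 1)], replay, replay_label=0)
```

Ranking by distance in hidden space is meant to filter out low-quality generator outputs before they are replayed; other criteria over the same representations would fit the same interface.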
Taxonomy
Topics: Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection
