DRMD: Deep Reinforcement Learning for Malware Detection under Concept Drift

Shae McFadden; Myles Foley; Mario D'Onghia; Chris Hicks; Vasilios Mavroudis; Nicola Paoletti; Fabio Pierazzi

arXiv:2508.18839 · cs.LG · March 23, 2026

TL;DR

This paper introduces a deep reinforcement learning approach for malware detection that adapts to concept drift, improving long-term performance and resilience in dynamic Android malware environments.

Contribution

It formulates malware detection as a Markov Decision Process and trains a DRL agent to optimize detection and manual labeling decisions under concept drift.
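
The one-step formulation described above can be illustrated with a minimal sketch. It assumes three actions (classify as benign, classify as malware, reject for manual labeling) and hypothetical reward values; the action names, the `step` and `q_target` helpers, and the reward magnitudes are illustrative assumptions, not the reward design used in the paper.

```python
# Illustrative sketch of a one-step MDP for detection with rejection.
# All names and reward values here are assumptions, not the paper's design.

BENIGN, MALWARE, REJECT = 0, 1, 2


def step(label, action, correct=1.0, wrong=-1.0, reject=0.0):
    """Play one single-sample episode.

    label  -- ground truth: 0 (benign) or 1 (malware)
    action -- BENIGN, MALWARE, or REJECT
    Returns (reward, done); done is always True because each episode
    consists of exactly one decision about one sample.
    """
    if action == REJECT:
        r = reject  # hypothetical neutral payoff for deferring to a human
    elif action == label:
        r = correct
    else:
        r = wrong
    return r, True


def q_target(reward, done, next_q_max=0.0, gamma=0.99):
    """Standard Q-learning target; with done=True it is just the reward."""
    return reward if done else reward + gamma * next_q_max
```

Because every episode ends after a single action, the learning target never bootstraps from a next state: the agent's value estimate for an action is simply its expected immediate reward, which is what lets classification and rejection be traded off within one objective.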

Findings

01 DRMD outperforms standard classifiers on Area Under Time (AUT) metrics.

02 DRMD improves AUT by up to 10.90.

03 The results demonstrate DRL's effectiveness for malware detection under concept drift.
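
For readers unfamiliar with AUT, the sketch below shows one common way it is computed, assuming the Area Under Time definition from the TESSERACT evaluation framework: the trapezoidal area under a per-period metric (for example, monthly F1), normalized by the number of intervals. The function name `aut` and the sample scores are illustrative, and the scale on which this page reports AUT differences is not stated here.

```python
def aut(scores):
    """Area Under Time for a sequence of per-period metric values.

    scores -- metric values in [0, 1], one per test period, in time order.
    Uses the trapezoidal rule and divides by the number of intervals,
    so a detector that scores 1.0 in every period has AUT 1.0.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("AUT needs at least two test periods")
    area = sum((scores[k] + scores[k + 1]) / 2 for k in range(n - 1))
    return area / (n - 1)


# Hypothetical monthly F1 scores that decay under drift.
decaying = [0.8, 0.6, 0.4]
print(aut(decaying))
```

A detector whose per-period score decays more slowly yields a larger AUT, which is why the metric is used to compare robustness to drift over a test window.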

Abstract

Malware detection in real-world settings must deal with evolving threats, limited labeling budgets, and uncertain predictions. Traditional classifiers, without additional mechanisms, struggle to maintain performance under concept drift in malware domains, as their supervised learning formulation cannot optimize when to defer decisions to manual labeling and adaptation. Modern malware detection pipelines combine classifiers with monthly active learning (AL) and rejection mechanisms to mitigate the impact of concept drift. In this work, we develop a novel formulation of malware detection as a one-step Markov Decision Process and train a deep reinforcement learning (DRL) agent, simultaneously optimizing sample classification performance and rejecting high-risk samples for manual labeling. We evaluated the joint detection and drift mitigation policy learned by the DRL-based Malware…
