Attackers Strike Back? Not Anymore -- An Ensemble of RL Defenders Awakens for APT Detection
Sidahmed Benabderrahmane, Talal Rahwan

TL;DR
This paper presents an adaptive APT detection framework combining auto-encoders, multiple reinforcement learning agents, and active learning to improve detection of stealthy, evolving cyber threats.
Contribution
It introduces a multi-agent RL ensemble with active learning for dynamic, robust APT detection, addressing limitations of static traditional systems.
Findings
Enhanced detection accuracy over baseline methods
Effective adaptation to evolving attack strategies
Robust ensemble voting improves decision reliability
Abstract
Advanced Persistent Threats (APTs) represent a growing menace to modern digital infrastructure. Unlike traditional cyberattacks, APTs are stealthy, adaptive, and long-lasting, often bypassing signature-based detection systems. This paper introduces a novel framework for APT detection that unites deep learning, reinforcement learning (RL), and active learning into a cohesive, adaptive defense system. Our system combines auto-encoders for latent behavioral encoding with a multi-agent ensemble of RL-based defenders, each trained to distinguish between benign and malicious process behaviors. We identify a critical challenge in existing detection systems: their static nature and inability to adapt to evolving attack strategies. To this end, our architecture includes multiple RL agents (Q-Learning, PPO, DQN, adversarial defenders), each analyzing latent vectors generated by an auto-encoder.…
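The ensemble idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract excerpt does not state the voting rule or the agents' internals, so simple majority voting is assumed, and each trained RL defender is replaced by a hypothetical linear threshold rule over a latent vector. The names `threshold_defender` and `majority_vote`, along with all weights, are illustrative assumptions.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a defender maps a latent vector (as produced by an
# auto-encoder) to 1 (malicious) or 0 (benign).
Defender = Callable[[Sequence[float]], int]


def threshold_defender(weights: Sequence[float], bias: float) -> Defender:
    """Stand-in for a trained RL agent: a fixed linear decision rule.

    Assumption for illustration only; a real agent would be a learned policy.
    """
    def decide(latent: Sequence[float]) -> int:
        score = sum(w * x for w, x in zip(weights, latent)) + bias
        return 1 if score > 0 else 0
    return decide


def majority_vote(defenders: List[Defender], latent: Sequence[float]) -> int:
    """Return 1 if a strict majority of defenders flag the latent vector.

    Ties (possible with an even number of defenders) resolve to benign (0).
    """
    votes = [defender(latent) for defender in defenders]
    return 1 if sum(votes) * 2 > len(votes) else 0


# Three hypothetical defenders with hand-picked weights.
ensemble = [
    threshold_defender([1.0, 0.0], -0.5),
    threshold_defender([0.0, 1.0], -0.5),
    threshold_defender([1.0, 1.0], -1.0),
]

suspicious = majority_vote(ensemble, [1.0, 0.2])
quiet = majority_vote(ensemble, [0.1, 0.1])
```

In this toy setup, a single dissenting defender does not change the ensemble decision, which is the intuition behind voting as a way to make decisions more reliable than any one agent's output.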
