Adaptive Learning for Moving Target Defence: Enhancing Cybersecurity Strategies

Mandar Datar (CEA-LETI); Yann Dujardin

arXiv:2508.17945·cs.GT·August 26, 2025

TL;DR

This paper models Moving Target Defence (MTD) as a partially observable stochastic game between an attacker and a defender, and proposes a threshold-based reinforcement learning approach through which both players learn equilibrium strategies, improving the system's security.

Contribution

It introduces a structure-aware policy gradient algorithm for MTD, enabling adaptive, equilibrium-seeking strategies in cybersecurity defense.

Findings

01. Optimal strategies for both players follow a threshold structure
02. The proposed algorithm converges to a Nash equilibrium
03. Enhanced defender adaptability improves security outcomes

Abstract

In this work, we model Moving Target Defence (MTD) as a partially observable stochastic game between an attacker and a defender. The attacker tries to compromise the system through probing actions, while the defender minimizes the risk by reimaging the system, balancing between performance cost and security level. We demonstrate that the optimal strategies for both players follow a threshold structure. Based on this insight, we propose a structure-aware policy gradient reinforcement learning algorithm that helps both players converge to the Nash equilibrium. This approach enhances the defender's ability to adapt and effectively counter evolving threats, improving the overall security of the system. Finally, we validate the proposed method through numerical simulations.
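The abstract does not specify the game model, so the following is only a minimal sketch of what a threshold-structured policy gradient could look like, not the authors' algorithm. It assumes a hypothetical toy setting: the defender tracks a belief in [0, 1] that the system is compromised, reimages with a probability given by a sigmoid around a single learnable threshold, and updates that threshold with a REINFORCE-style rule against a fixed attacker. All names, costs, and dynamics (`probe_rate`, `reimage_cost`, `compromise_cost`, the belief update) are illustrative assumptions, and the attacker's own learning, which the paper also covers, is left out.

```python
import math
import random


def reimage_probability(belief, threshold, temperature=0.05):
    """Soft threshold policy: probability of reimaging at a given compromise belief."""
    return 1.0 / (1.0 + math.exp(-(belief - threshold) / temperature))


def grad_log_policy(belief, threshold, action, temperature=0.05):
    """Derivative of log pi(action | belief) with respect to the threshold."""
    p = reimage_probability(belief, threshold, temperature)
    # Bernoulli policy with logit z = (belief - threshold) / temperature:
    # d log pi / dz = action - p, and dz / d(threshold) = -1 / temperature.
    return -(action - p) / temperature


def run_episode(threshold, rng, horizon=50, probe_rate=0.1,
                reimage_cost=0.3, compromise_cost=1.0):
    """Simulate one episode of the assumed toy dynamics; return (reward, score)."""
    belief, total_reward, score = 0.0, 0.0, 0.0
    for _ in range(horizon):
        action = 1 if rng.random() < reimage_probability(belief, threshold) else 0
        score += grad_log_policy(belief, threshold, action)
        if action == 1:
            # Reimaging costs performance but resets the compromise belief.
            total_reward -= reimage_cost
            belief = 0.0
        else:
            # Waiting incurs expected loss; attacker probing raises the belief.
            total_reward -= compromise_cost * belief
            belief += (1.0 - belief) * probe_rate
    return total_reward, score


def train_threshold(episodes=2000, learning_rate=1e-4, seed=0):
    """REINFORCE on the single threshold parameter, with a running-average baseline."""
    rng = random.Random(seed)
    threshold, baseline = 0.5, 0.0
    for _ in range(episodes):
        reward, score = run_episode(threshold, rng)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        threshold += learning_rate * advantage * score
        threshold = min(max(threshold, 0.0), 1.0)  # keep the threshold a valid belief
    return threshold


if __name__ == "__main__":
    print("learned reimaging threshold:", train_threshold())
```

Restricting the policy to one threshold parameter is what makes the search "structure-aware" in this sketch: the learner explores only threshold-shaped policies instead of an arbitrary mapping from beliefs to actions.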
