PoolFlip: A Multi-Agent Reinforcement Learning Security Environment for Cyber Defense
Xavier Cadet, Simona Boboila, Sie Hendrata Dharmawan, Alina Oprea, Peter Chin

TL;DR
This paper introduces PoolFlip, a multi-agent reinforcement learning environment based on FlipIt for cyber defense, and proposes Flip-PSRO, a training method that enhances defender adaptability against evolving adversaries.
Contribution
We develop PoolFlip, a flexible environment for multi-agent RL in cyber defense, and introduce Flip-PSRO, a population-based training approach for robust defender strategies.
Findings
Flip-PSRO defenders are about twice as effective as baseline defenders against attacks not seen during training.
The environment enables efficient learning for adaptive cyber defense agents.
Ownership-based utility functions help Flip-PSRO defenders maintain a high level of control while optimizing performance.
Abstract
Cyber defense requires automating defensive decision-making under stealthy, deceptive, and continuously evolving adversarial strategies. The FlipIt game provides a foundational framework for modeling interactions between a defender and an advanced adversary that compromises a system without being immediately detected. In FlipIt, the attacker and defender compete to control a shared resource by performing a Flip action and paying a cost. However, the existing FlipIt frameworks rely on a small number of heuristics or specialized learning techniques, which can lead to brittleness and the inability to adapt to new attacks. To address these limitations, we introduce PoolFlip, a multi-agent gym environment that extends the FlipIt game to allow efficient learning for attackers and defenders. Furthermore, we propose Flip-PSRO, a multi-agent reinforcement learning (MARL) approach that leverages…
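To make the FlipIt mechanics described in the abstract concrete, the following is a minimal sketch of a discrete-time, single-resource game in which each player may flip at any step and pays a cost per flip. It is not the PoolFlip environment or its API. The names `simulate_flipit`, `periodic`, and `FlipItResult`, the tie-breaking rule, the initial owner, and the utility (steps in control minus total flip cost) are all assumptions chosen for illustration; the paper's ownership-based utility functions are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable

# A strategy maps the current time step to "flip now?" (True/False).
Strategy = Callable[[int], bool]


@dataclass
class FlipItResult:
    defender_control: int    # steps during which the defender owned the resource
    attacker_control: int    # steps during which the attacker owned the resource
    defender_flips: int
    attacker_flips: int
    defender_utility: float  # control steps minus total flip cost (assumed form)
    attacker_utility: float


def periodic(period: int, offset: int = 0) -> Strategy:
    """Hypothetical heuristic: flip every `period` steps, starting at `offset`."""
    def strategy(t: int) -> bool:
        return t >= offset and (t - offset) % period == 0
    return strategy


def simulate_flipit(defender: Strategy, attacker: Strategy, horizon: int,
                    defender_cost: float, attacker_cost: float) -> FlipItResult:
    owner = "defender"  # assumption: the defender owns the resource at t = 0
    control = {"defender": 0, "attacker": 0}
    flips = {"defender": 0, "attacker": 0}

    for t in range(horizon):
        d_flip = defender(t)
        a_flip = attacker(t)
        flips["defender"] += int(d_flip)
        flips["attacker"] += int(a_flip)

        # Assumed tie-break: if both players flip in the same step, the defender wins.
        if d_flip:
            owner = "defender"
        elif a_flip:
            owner = "attacker"

        # Whoever owns the resource after this step's flips is credited for the step.
        control[owner] += 1

    return FlipItResult(
        defender_control=control["defender"],
        attacker_control=control["attacker"],
        defender_flips=flips["defender"],
        attacker_flips=flips["attacker"],
        defender_utility=control["defender"] - defender_cost * flips["defender"],
        attacker_utility=control["attacker"] - attacker_cost * flips["attacker"],
    )


if __name__ == "__main__":
    result = simulate_flipit(
        defender=periodic(4, offset=0),
        attacker=periodic(4, offset=2),
        horizon=8,
        defender_cost=1.0,
        attacker_cost=1.0,
    )
    print(result)
```

In this toy setting both players follow fixed periodic heuristics, which is the kind of non-adaptive behavior the abstract identifies as a limitation; a learned defender would replace the `defender` callable with a policy that conditions on its observations.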
