PoolFlip: A Multi-Agent Reinforcement Learning Security Environment for Cyber Defense

Xavier Cadet; Simona Boboila; Sie Hendrata Dharmawan; Alina Oprea; Peter Chin

arXiv:2508.19488 · cs.LG · August 28, 2025

TL;DR

This paper introduces PoolFlip, a multi-agent reinforcement learning environment based on FlipIt for cyber defense, and proposes Flip-PSRO, a training method that enhances defender adaptability against evolving adversaries.

Contribution

We develop PoolFlip, a flexible environment for multi-agent RL in cyber defense, and introduce Flip-PSRO, a population-based training approach for robust defender strategies.

Findings

01. Flip-PSRO defenders are twice as effective as baselines against attacks not seen during training.

02. The environment enables efficient learning for adaptive cyber defense agents.

03. Ownership-based utility functions improve control and performance.

Abstract

Cyber defense requires automating defensive decision-making against stealthy, deceptive, and continuously evolving adversarial strategies. The FlipIt game provides a foundational framework for modeling interactions between a defender and an advanced adversary that compromises a system without being immediately detected. In FlipIt, the attacker and the defender compete to control a shared resource: each can take control by performing a Flip action, and each Flip incurs a cost. However, existing FlipIt frameworks rely on a small number of heuristics or specialized learning techniques, which can make them brittle and unable to adapt to new attacks. To address these limitations, we introduce PoolFlip, a multi-agent gym environment that extends the FlipIt game to allow efficient learning for attackers and defenders. Furthermore, we propose Flip-PSRO, a multi-agent reinforcement learning (MARL) approach that leverages…
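The FlipIt mechanics described in the abstract (two players, one shared resource, a costly Flip action) can be illustrated with a small discrete-time simulation. This is a minimal sketch under stated assumptions, not the PoolFlip environment or its API: the names `simulate_flipit`, `utility`, and `FlipItResult` are hypothetical, both players use fixed periodic strategies, the defender owns the resource at time 0, and the defender wins when both players flip in the same step. The utility shown (fraction of time in control minus the average flip cost) is one common ownership-based form and may differ from the utilities defined in the paper.

```python
from dataclasses import dataclass


@dataclass
class FlipItResult:
    """Outcome of one simulated game (hypothetical container)."""
    defender_owned: int   # time steps in which the defender held the resource
    attacker_owned: int   # time steps in which the attacker held the resource
    defender_flips: int   # number of Flip actions taken by the defender
    attacker_flips: int   # number of Flip actions taken by the attacker


def simulate_flipit(defender_period: int, attacker_period: int,
                    horizon: int) -> FlipItResult:
    """Simulate FlipIt where each player flips every `period` steps."""
    owner = "defender"  # assumption: the defender starts in control
    owned = {"defender": 0, "attacker": 0}
    flips = {"defender": 0, "attacker": 0}

    for t in range(1, horizon + 1):
        if t % attacker_period == 0:
            flips["attacker"] += 1
            owner = "attacker"
        # Applied second, so the defender wins simultaneous flips (assumption).
        if t % defender_period == 0:
            flips["defender"] += 1
            owner = "defender"
        owned[owner] += 1  # credit this step to whoever holds the resource

    return FlipItResult(owned["defender"], owned["attacker"],
                        flips["defender"], flips["attacker"])


def utility(owned_steps: int, flips: int, horizon: int,
            flip_cost: float) -> float:
    """Fraction of time in control minus the average cost paid per step."""
    return (owned_steps - flip_cost * flips) / horizon


if __name__ == "__main__":
    result = simulate_flipit(defender_period=4, attacker_period=6, horizon=12)
    print(result)
    print("defender utility:",
          utility(result.defender_owned, result.defender_flips, 12, 1.0))
    print("attacker utility:",
          utility(result.attacker_owned, result.attacker_flips, 12, 1.0))
```

Flipping more often raises the share of time a player holds the resource but also raises the total cost paid, which is the trade-off a learned defender has to balance.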
