ACDZero: MCTS Agent for Mastering Automated Cyber Defense
Yu Li, Sizhe Tang, Rongqian Chen, Fei Xu Yu, Guangyu Jiang, Mahdi Imani, Nathaniel D. Bastian, Tian Lan

TL;DR
ACDZero introduces a planning-based MCTS agent utilizing graph neural networks for sample-efficient automated cyber defense, demonstrating improved performance over existing reinforcement learning methods in complex network scenarios.
Contribution
The paper presents a novel MCTS-based approach with graph neural network embeddings for automated cyber defense, enhancing exploration efficiency and decision quality.
Findings
Outperforms RL baselines across diverse CAGE Challenge 4 (CC4) scenarios.
Improves defense reward and robustness.
Effectively models complex network relationships.
Abstract
Automated cyber defense (ACD) seeks to protect computer networks with minimal or no human intervention, reacting to intrusions by taking corrective actions such as isolating hosts, resetting services, deploying decoys, or updating access controls. However, existing approaches to ACD, such as deep reinforcement learning (RL), often struggle with exploration in complex networks with large decision and state spaces, and thus require a large number of samples. Motivated by the need to learn sample-efficient defense policies, we frame ACD in CAGE Challenge 4 (CAGE-4 / CC4) as a context-based partially observable Markov decision problem and propose a planning-centric defense policy based on Monte Carlo Tree Search (MCTS). It explicitly models the exploration-exploitation tradeoff in ACD and uses statistical sampling to guide exploration and decision making. We make novel use of graph neural…
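To illustrate the exploration-exploitation tradeoff that the abstract attributes to MCTS, the sketch below scores candidate defense actions with the classic UCB1 rule. This is a generic textbook selection step, not the selection rule confirmed for ACDZero; the function names, the exploration constant `c`, and the action labels in the usage example are all hypothetical.

```python
import math


def ucb1(total_value, visits, parent_visits, c=1.4):
    """UCB1 score: mean return plus an exploration bonus.

    The constant c is an assumed tuning parameter, not a value from the paper.
    """
    if visits == 0:
        # Unvisited actions get an infinite score so they are tried first.
        return math.inf
    mean_return = total_value / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean_return + bonus


def select_action(stats, c=1.4):
    """Pick the action with the highest UCB1 score.

    stats maps a (hypothetical) action name to (total_value, visits).
    """
    parent_visits = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda a: ucb1(*stats[a], parent_visits, c))


# Usage: a decoy action that has never been simulated is selected before
# a well-explored action, regardless of the latter's average return.
stats = {"isolate_host": (5.0, 10), "deploy_decoy": (0.0, 0)}
chosen = select_action(stats)
```

The bonus term shrinks as an action accumulates visits, so the search gradually shifts from trying under-sampled actions to favoring those with the best average simulated return.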
Taxonomy
Topics: Information and Cyber Security · Software-Defined Networks and 5G · Adversarial Robustness in Machine Learning
