NASimJax: GPU-Accelerated Policy Learning Framework for Penetration Testing
Raphael Simon, José Carrasquel, Wim Mees, Pieter Libin

TL;DR
NASimJax is a GPU-accelerated framework for reinforcement learning in penetration testing, enabling large-scale experiments and improved policy generalization in complex network scenarios.
Contribution
The paper introduces NASimJax, a JAX-based high-throughput simulator, together with 2SAS, a two-stage action decomposition for scalable action spaces, and a scenario generation method for better policy generalization.
Findings
Prioritized Level Replay outperforms Domain Randomization at scale.
Training on sparser topologies improves out-of-distribution generalization.
Two-stage action decomposition (2SAS) outperforms flat action masking.
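To make the contrast in the last finding concrete, here is a minimal JAX sketch of flat action masking next to one possible two-stage decomposition. It assumes the first stage picks a target host and the second picks an action type for that host; the paper's actual 2SAS design may differ. All names (`NUM_HOSTS`, `NUM_ACTION_TYPES`, `flat_select`, `two_stage_select`) and the table-shaped `type_logits` are hypothetical and are not taken from NASimJax.

```python
import jax
import jax.numpy as jnp

NUM_HOSTS = 4          # hypothetical network size
NUM_ACTION_TYPES = 3   # hypothetical number of action types per host


def masked_sample(key, logits, mask):
    # Invalid entries get -inf logits, so they are never sampled.
    masked_logits = jnp.where(mask, logits, -jnp.inf)
    return jax.random.categorical(key, masked_logits)


def flat_select(key, flat_logits, valid):
    # Flat masking: one categorical over all (host, action type) pairs.
    # valid is a bool array of shape [NUM_HOSTS, NUM_ACTION_TYPES].
    idx = masked_sample(key, flat_logits, valid.reshape(-1))
    return idx // NUM_ACTION_TYPES, idx % NUM_ACTION_TYPES


def two_stage_select(key, host_logits, type_logits, valid):
    # Assumed decomposition: stage 1 picks a host that has at least one
    # valid action, stage 2 picks an action type valid for that host.
    key_host, key_type = jax.random.split(key)
    host_mask = valid.any(axis=1)
    host = masked_sample(key_host, host_logits, host_mask)
    action_type = masked_sample(key_type, type_logits[host], valid[host])
    return host, action_type
```

In this sketch the flat variant samples from a single distribution over `NUM_HOSTS * NUM_ACTION_TYPES` entries, while the two-stage variant samples from two much smaller distributions in sequence. In a real policy the second-stage logits would presumably come from a network head conditioned on the chosen host rather than from a precomputed table.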
Abstract
Penetration testing, the practice of simulating cyberattacks to identify vulnerabilities, is a complex sequential decision-making task that is inherently partially observable and features large action spaces. Training reinforcement learning (RL) policies for this domain faces a fundamental bottleneck: existing simulators are too slow to train on realistic network scenarios at scale, resulting in policies that fail to generalize. We present NASimJax, a complete JAX-based reimplementation of the Network Attack Simulator (NASim), achieving up to 100x higher environment throughput than the original simulator. By running the entire training pipeline on hardware accelerators, NASimJax enables experimentation on larger networks under fixed compute budgets that were previously infeasible. We formulate automated penetration testing as a Contextual POMDP and introduce a network generation…
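The throughput argument in the abstract rests on running the environment itself on the accelerator. As an illustration only, the sketch below steps many copies of a toy environment in parallel with `jax.vmap` and `jax.jit`. The environment (`reset`, `step`, a boolean "compromised hosts" state, a 0.5 exploit success probability) is a hypothetical stand-in and does not reflect NASimJax's actual API or dynamics.

```python
import jax
import jax.numpy as jnp

NUM_HOSTS = 8  # hypothetical network size


def reset():
    # Toy state: which hosts are compromised. Host 0 is the initial foothold.
    return jnp.zeros(NUM_HOSTS, dtype=bool).at[0].set(True)


def step(state, action, key):
    # Toy dynamics: attacking host `action` succeeds with probability 0.5.
    success = jax.random.bernoulli(key, 0.5)
    newly_compromised = success & ~state[action]
    new_state = state.at[action].set(state[action] | success)
    reward = newly_compromised.astype(jnp.float32)
    return new_state, reward


# Vectorize over a batch of environments and compile the whole step.
batched_step = jax.jit(jax.vmap(step))

num_envs = 1024
keys = jax.random.split(jax.random.PRNGKey(0), num_envs)
states = jnp.tile(reset(), (num_envs, 1))
actions = jnp.full((num_envs,), 3)  # every environment attacks host 3
states, rewards = batched_step(states, actions, keys)
```

Because `step` is a pure function of arrays, the same code runs one environment or thousands without a Python loop, which is the general pattern that lets the simulator and the learner share one device.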
Taxonomy
Topics: Information and Cyber Security · Security and Verification in Computing · Network Security and Intrusion Detection
