Can LLMs Hack Enterprise Networks? -- Replicated Computational Results (RCR) Report
Andreas Happe, Jürgen Cito

TL;DR
This paper empirically evaluates the ability of various large language models to perform penetration testing on enterprise networks, specifically Microsoft Active Directory simulations, providing reproducible artifacts and analysis tools.
Contribution
It documents the artifacts, evaluation setup, and analysis scripts needed to replicate the paper's assessment of LLMs in enterprise network penetration testing, improving transparency and reproducibility.
Findings
LLMs can perform penetration testing against simulated enterprise networks, with varying success rates
The report provides detailed artifacts and scripts for replication
The report offers insights into LLM capabilities in cybersecurity contexts
Abstract
This is the Replicated Computational Results (RCR) Report for the paper "Can LLMs Hack Enterprise Networks?" The paper empirically investigates the efficacy and effectiveness of different LLMs for penetration-testing enterprise networks, i.e., Microsoft Active Directory Assumed-Breach Simulations. This RCR report describes the artifacts used in the paper, explains how to create an evaluation setup, and highlights the analysis scripts provided within our prototype.
Taxonomy
Topics: Software System Performance and Reliability · Software-Defined Networks and 5G · Information and Cyber Security
