CIPHER: Cybersecurity Intelligent Penetration-testing Helper for Ethical Researcher
Derry Pratama, Naufal Suryanto, Andro Aprila Adiputra, Thi-Thu-Huong Le, Ahmada Yusril Kadiptya, Muhammad Iqbal, and Howon Kim

TL;DR
CIPHER is a specialized large language model designed to assist ethical cybersecurity researchers in penetration testing, trained on extensive hacking literature and evaluated with a novel benchmark to improve AI-driven cybersecurity tools.
Contribution
We introduce CIPHER, a large language model tailored for penetration testing, and a new FARR Flow augmentation method for benchmarking AI in cybersecurity tasks.
Findings
CIPHER outperforms similar-sized models and larger state-of-the-art models in penetration testing tasks.
FARR Flow augmentation provides a realistic benchmark for evaluating AI in penetration testing.
Current general LLMs are insufficient for effective penetration testing guidance.
Abstract
Penetration testing, a critical component of cybersecurity, typically requires extensive time and effort to find vulnerabilities. Beginners in this field often benefit from collaborative approaches with the community or experts. To address this, we develop CIPHER (Cybersecurity Intelligent Penetration-testing Helper for Ethical Researchers), a large language model specifically trained to assist in penetration testing tasks. We trained CIPHER using over 300 high-quality write-ups of vulnerable machines, hacking techniques, and documentation of open-source penetration testing tools. Additionally, we introduced the Findings, Action, Reasoning, and Results (FARR) Flow augmentation, a novel method to augment penetration testing write-ups to establish a fully automated pentesting simulation benchmark tailored for large language models. This approach fills a significant gap in traditional…
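To make the four FARR fields concrete, here is a minimal sketch of how one step of a write-up might be represented and turned into a benchmark question. It is an illustration under stated assumptions, not the authors' implementation: the names `FarrStep` and `build_question`, the prompt wording, and the example values are all hypothetical, and only the four field names come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FarrStep:
    """One write-up step split into the four FARR fields (hypothetical schema)."""
    findings: str   # what the tester knows before acting
    action: str     # what the tester did next
    reasoning: str  # why that action was chosen
    result: str     # what the action produced


def build_question(step: FarrStep) -> str:
    """Assumed question format: show only the findings and ask for the next move.

    The action, reasoning, and result are withheld so they can serve as the
    reference answer when a model's reply is scored.
    """
    return (
        f"Findings: {step.findings}\n"
        "What should the tester do next, and why?"
    )


# Made-up example values, for illustration only.
example = FarrStep(
    findings="Port scan shows an HTTP service on port 80",
    action="Enumerate web directories",
    reasoning="Hidden paths often expose admin panels or backups",
    result="An exposed backup directory is discovered",
)
question = build_question(example)
```

Keeping the answer fields out of the question is the one design choice this sketch depends on: a model being evaluated sees the same information a tester would have at that point in the write-up and nothing more.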
Taxonomy
Topics: Network Security and Intrusion Detection
Methods: LLaMA
