CIPHER: Cybersecurity Intelligent Penetration-testing Helper for Ethical Researcher

Derry Pratama; Naufal Suryanto; Andro Aprila Adiputra; Thi-Thu-Huong Le; Ahmada Yusril Kadiptya; Muhammad Iqbal; and Howon Kim

arXiv:2408.11650·cs.CR·November 7, 2024

TL;DR

CIPHER is a specialized large language model designed to assist ethical cybersecurity researchers in penetration testing, trained on extensive hacking literature and evaluated with a novel benchmark to improve AI-driven cybersecurity tools.

Contribution

We introduce CIPHER, a large language model tailored for penetration testing, and a new FARR Flow augmentation method for benchmarking AI in cybersecurity tasks.

Findings

1. CIPHER outperforms similar-sized models and larger state-of-the-art models in penetration testing tasks.
2. FARR Flow augmentation provides a realistic benchmark for evaluating AI in penetration testing.
3. Current general LLMs are insufficient for effective penetration testing guidance.

Abstract

Penetration testing, a critical component of cybersecurity, typically requires extensive time and effort to find vulnerabilities. Beginners in this field often benefit from collaborative approaches with the community or experts. To address this, we develop CIPHER (Cybersecurity Intelligent Penetration-testing Helper for Ethical Researchers), a large language model specifically trained to assist in penetration testing tasks. We trained CIPHER using over 300 high-quality write-ups of vulnerable machines, hacking techniques, and documentation of open-source penetration testing tools. Additionally, we introduced the Findings, Action, Reasoning, and Results (FARR) Flow augmentation, a novel method to augment penetration testing write-ups to establish a fully automated pentesting simulation benchmark tailored for large language models. This approach fills a significant gap in traditional…
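
The FARR acronym above (Findings, Action, Reasoning, Results) suggests a simple record structure for each step of a write-up. The following is a minimal sketch of how such a record could be turned into a benchmark question, not the authors' implementation. The names `FarrStep` and `build_questions`, the prompt wording, and the sample steps are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FarrStep:
    """One write-up step split into the four FARR fields (hypothetical schema)."""
    findings: str   # what is known before acting
    action: str     # what the write-up author did next
    reasoning: str  # why that action was chosen
    result: str     # what the action revealed


def build_questions(steps: List[FarrStep]) -> List[Dict[str, object]]:
    """Turn each step into a prompt plus reference answer.

    The model under test would see only the findings and be asked for the
    next action; the write-up's own action and reasoning act as the reference.
    """
    questions = []
    for index, step in enumerate(steps, start=1):
        questions.append({
            "step": index,
            "prompt": (
                f"Findings so far: {step.findings}\n"
                "What should be done next, and why?"
            ),
            "reference_action": step.action,
            "reference_reasoning": step.reasoning,
            "expected_result": step.result,
        })
    return questions


# Made-up example flow, not taken from the paper's dataset.
example_flow = [
    FarrStep(
        findings="Port 80 is open on the target.",
        action="Enumerate web directories.",
        reasoning="An exposed web server may host hidden pages.",
        result="An /admin login page is discovered.",
    ),
    FarrStep(
        findings="An /admin login page is discovered.",
        action="Test the login form for default credentials.",
        reasoning="Default credentials are a common misconfiguration.",
        result="Login succeeds with a default account.",
    ),
]

if __name__ == "__main__":
    for question in build_questions(example_flow):
        print(question["step"], question["prompt"].splitlines()[0])
```

In this sketch, each step's result becomes the next step's findings, so a write-up unrolls into a chain of questions that can be scored without a human in the loop.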

Code & Models

Repositories

ibndias/cipher (official)

Taxonomy

Topics: Network Security and Intrusion Detection

Methods: LLaMA