Hacking, The Lazy Way: LLM Augmented Pentesting

Dhruva Goyal; Sitaraman Subramanian; Aditya Peela; Nisha P. Shetty

arXiv:2409.09493 · cs.CR · May 20, 2025 · 2 cites

Open Access

TL;DR

This paper presents LLM Augmented Pentesting with a tool called Pentest Copilot, which integrates GPT-4-turbo and retrieval-augmented generation to automate and improve penetration testing tasks, bridging automation and human expertise.

Contribution

Introducing a novel LLM-based framework and tool for penetration testing that enhances automation, decision-making, and real-world applicability in cybersecurity.

Findings

1. Significantly improved task completion rates in pentesting.
2. Effective reduction of hallucinations through RAG.
3. Enhanced decision-making with chain of thought mechanisms.

Abstract

In our research, we introduce a new concept called "LLM Augmented Pentesting," demonstrated with a tool named "Pentest Copilot," which brings Large Language Models (LLMs) into penetration testing workflows using the GPT-4-turbo model. Our approach focuses on overcoming the traditional resistance to automation in penetration testing by using LLMs to automate specific sub-tasks while maintaining a comprehensive understanding of the overall testing process. Pentest Copilot shows proficiency in tasks such as using testing tools, interpreting their outputs, and suggesting follow-up actions, bridging the gap between automated systems and human expertise. By integrating a "chain of thought" mechanism, Pentest Copilot optimizes token usage and enhances decision-making processes, leading to more accurate…

Taxonomy

Topics: Digital Rights Management and Security

MethodsFocus