TL;DR
This paper investigates using large language models such as GPT-3.5 to assist penetration testers in two ways: planning security tests at a high level, and hunting for vulnerabilities through automated interaction with a vulnerable virtual machine.
Contribution
It demonstrates the feasibility of integrating LLMs into penetration testing workflows for task planning and vulnerability exploitation, and reports promising initial results.
Findings
LLMs can assist in high-level security testing planning.
Automated vulnerability hunting with LLMs shows potential.
The approach opens new avenues for AI-augmented cybersecurity.
Abstract
Software security testing, and penetration testing in particular, requires high levels of expertise and involves many manual testing and analysis steps. This paper explores the potential use of large language models, such as GPT-3.5, to augment penetration testers with AI sparring partners. We explore the feasibility of supplementing penetration testers with AI models for two distinct use cases: high-level task planning for security testing assignments and low-level vulnerability hunting within a vulnerable virtual machine. For the latter, we implemented a closed feedback loop between LLM-generated low-level actions and a vulnerable virtual machine (connected through SSH), allowing the LLM to analyze the machine state for vulnerabilities and suggest concrete attack vectors, which were automatically executed within the virtual machine. We discuss…
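The closed feedback loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `feedback_loop`, `scripted_suggest`, and `fake_execute`, the `DONE` stop token, and the round limit are all assumptions made for this example. The model and the SSH connection are replaced by local stand-ins so the loop structure can be read and run in isolation.

```python
from typing import Callable, List, Tuple

# Each history entry pairs a command with the output it produced.
History = List[Tuple[str, str]]


def feedback_loop(
    suggest: Callable[[History], str],
    execute: Callable[[str], str],
    max_rounds: int = 5,
    stop_token: str = "DONE",
) -> History:
    """Alternate between asking for a command and running it.

    `suggest` stands in for the LLM call, `execute` for running a command
    on the test machine (over SSH in the paper). Both are hypothetical
    placeholders injected by the caller.
    """
    history: History = []
    for _ in range(max_rounds):
        command = suggest(history).strip()
        if not command or command == stop_token:
            break  # the suggester signalled that it has nothing more to try
        output = execute(command)
        # Feed the result back so the next suggestion can depend on it.
        history.append((command, output))
    return history


def scripted_suggest(history: History) -> str:
    """Stand-in for the model: replays a fixed script (assumption)."""
    script = ["id", "uname -a", "DONE"]
    return script[len(history)] if len(history) < len(script) else "DONE"


def fake_execute(command: str) -> str:
    """Stand-in for the remote shell: returns canned output (assumption)."""
    canned = {
        "id": "uid=1000(lowpriv) gid=1000(lowpriv)",
        "uname -a": "Linux testvm",
    }
    return canned.get(command, "")


if __name__ == "__main__":
    for command, output in feedback_loop(scripted_suggest, fake_execute):
        print(f"$ {command}\n{output}")
```

Passing the suggester and executor in as callables keeps the loop independent of any particular model API or SSH library. The round limit bounds how long an automated run can continue without human review.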
