Assessing Spear-Phishing Website Generation in Large Language Model Coding Agents
Tailia Malloy, Tegawende F. Bissyande

TL;DR
This paper evaluates large language models' ability to generate spear-phishing websites, highlighting risks and providing a dataset to aid in understanding and defending against such misuse in cybersecurity.
Contribution
The paper introduces a dataset of 200 spear-phishing website code bases and analyzes both the capability and the willingness of LLMs to produce malicious code, a novel focus in the cybersecurity assessment of LLMs.
Findings
Certain LLM metrics correlate with phishing site generation ability
Models vary significantly in willingness to produce malicious code
Dataset enables further research on LLM misuse in cyberattacks
Abstract
Large Language Models are expanding beyond being a tool humans use and into independent agents that can observe an environment, reason about solutions to problems, make changes that impact that environment, and understand how their actions impacted it. One of the most common applications of these LLM agents is in computer programming, where agents can successfully work alongside humans to generate code while controlling programming environments or networking systems. However, with the increasing ability and complexity of these agents come dangers of their potential misuse. A concerning application of LLM agents is in the domain of cybersecurity, where they have the potential to greatly expand the threat posed by attacks such as social engineering. This is because LLM agents can work autonomously and perform many tasks that would normally require…
Taxonomy
Topics: Spam and Phishing Detection · Cybercrime and Law Enforcement Studies · Misinformation and Its Impacts
