Can Large Language Models Automate Phishing Warning Explanations? A Controlled Experiment on Effectiveness and User Perception

Federico Maria Cau; Giuseppe Desolda; Francesco Greco; Lucio Davide Spano; Luca Viganò

arXiv:2507.07916 · cs.CR · December 16, 2025

Open Access

TL;DR

This study evaluates the use of Large Language Models to generate explanations for phishing warnings, finding they can match expert explanations in effectiveness and offer scalable, adaptive solutions for cybersecurity user education.

Contribution

The study demonstrates that LLMs can automatically produce effective phishing warning explanations, reducing manual effort and enhancing scalability in cybersecurity defenses.

Findings

01. LLMs produce explanations with protection levels comparable to expert-crafted messages.

02. Claude 3.5 Sonnet slightly reduced click-through rates, indicating potential behavioral impact.

03. Llama 3.3 was perceived as clearer but did not significantly lower click-through rates.

Abstract

Phishing has become a prominent risk in modern cybersecurity, often used to bypass technological defences by exploiting predictable human behaviour. Warning dialogues are a standard mitigation measure, but the lack of explanatory clarity and static content limits their effectiveness. In this paper, we report on our research to assess the capacity of Large Language Models (LLMs) to generate clear, concise, and scalable explanations for phishing warnings. We carried out a large-scale between-subjects user study (N = 750) to compare the influence of warning dialogues supplemented with manually generated explanations against those generated by two LLMs, Claude 3.5 Sonnet and Llama 3.3 70B. We investigated two explanatory styles (feature-based and counterfactual) for their effects on behavioural metrics (click-through rate) and perceptual outcomes (e.g., trust, risk, clarity). The results…

Taxonomy

Topics: Misinformation and Its Impacts · Safety Warnings and Signage · Spam and Phishing Detection