Talking Like a Phisher: LLM-Based Attacks on Voice Phishing Classifiers
Wenhao Li, Selvakumar Manickam, Yung-wey Chong, Shankar Karuppayah

TL;DR
This paper demonstrates that large language models can craft adversarial voice phishing transcripts that evade detection by machine learning classifiers, highlighting vulnerabilities in current vishing detection systems.
Contribution
It introduces a systematic attack pipeline using LLMs to generate deceptive vishing transcripts that bypass classifiers while maintaining semantic integrity.
Findings
LLM-generated transcripts significantly reduce classifier accuracy by up to 30.96%.
Transcripts maintain high semantic similarity as measured by BERTScore.
The attack process is time-efficient, averaging under 9 seconds per transcript.
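As a rough illustration of how an accuracy reduction like the one reported above could be measured, the sketch below scores the same classifier on original transcripts and on their rewritten counterparts, then takes the difference. It is not the paper's pipeline: `toy_predict`, the keyword set, and the example transcripts are hypothetical stand-ins for a trained classifier and the KorCCViD data, and the LLM rewriting step is assumed to have already happened.

```python
from typing import Callable, Sequence


def accuracy(predict: Callable[[str], int],
             texts: Sequence[str],
             labels: Sequence[int]) -> float:
    """Fraction of transcripts whose predicted label matches the true label."""
    correct = sum(1 for text, label in zip(texts, labels) if predict(text) == label)
    return correct / len(texts)


def accuracy_drop(predict: Callable[[str], int],
                  originals: Sequence[str],
                  rewrites: Sequence[str],
                  labels: Sequence[int]) -> float:
    """Accuracy on the original transcripts minus accuracy on the rewrites."""
    return accuracy(predict, originals, labels) - accuracy(predict, rewrites, labels)


# Hypothetical keyword-based detector, standing in for a trained ML classifier.
SUSPICIOUS = {"prosecutor", "account", "transfer"}


def toy_predict(text: str) -> int:
    """Label a transcript as vishing (1) if it contains two or more suspicious words."""
    words = set(text.lower().split())
    return 1 if len(words & SUSPICIOUS) >= 2 else 0


# Made-up transcripts; every one is a vishing script (label 1).
originals = [
    "this is the prosecutor please transfer funds to a safe account",
    "your account is frozen transfer the money now",
]
# Assumed outputs of an LLM rewriting step that keeps the intent but changes the wording.
rewrites = [
    "this is the legal office please move funds to a protected place",
    "your account is on hold send the money now",
]
labels = [1, 1]

drop = accuracy_drop(toy_predict, originals, rewrites, labels)
print(f"accuracy drop: {drop:.2%}")
```

In a real evaluation the rewrites would also be checked for semantic similarity to the originals (the paper uses BERTScore), since an accuracy drop only counts as evasion if the deceptive intent is preserved.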
Abstract
Voice phishing (vishing) remains a persistent threat in cybersecurity, exploiting human trust through persuasive speech. While machine learning (ML)-based classifiers have shown promise in detecting malicious call transcripts, they remain vulnerable to adversarial manipulations that preserve semantic content. In this study, we explore a novel attack vector where large language models (LLMs) are leveraged to generate adversarial vishing transcripts that evade detection while maintaining deceptive intent. We construct a systematic attack pipeline that employs prompt engineering and semantic obfuscation to transform real-world vishing scripts using four commercial LLMs. The generated transcripts are evaluated against multiple ML classifiers trained on a real-world Korean vishing dataset (KorCCViD) with statistical testing. Our experiments reveal that LLM-generated transcripts are both…
Taxonomy
Topics: Spam and Phishing Detection · Hate Speech and Cyberbullying Detection · Internet Traffic Analysis and Secure E-voting
