Evaluating the Performance of ChatGPT for Spam Email Detection
Shijing Si, Yuwei Wu, Le Tang, Yugui Zhang, Jedrek Wosik, and Qinliang Su

TL;DR
This study evaluates ChatGPT's effectiveness in spam email detection across English and Chinese datasets, comparing it with traditional machine learning models, and analyzes how prompt demonstrations influence its performance.
Contribution
It provides a comprehensive assessment of ChatGPT's capabilities for spam detection in multiple languages and explores the impact of in-context learning variations.
Findings
ChatGPT performs worse than deep learning models on large English datasets.
ChatGPT outperforms traditional models on low-resource Chinese datasets.
The number of demonstrations in prompts affects ChatGPT's spam detection accuracy.
Abstract
Email continues to be a pivotal and extensively utilized communication medium within professional and commercial domains. Nonetheless, the prevalence of spam emails poses a significant challenge for users, disrupting their daily routines and diminishing productivity. Consequently, accurately identifying and filtering spam based on content has become crucial for cybersecurity. Recent advancements in natural language processing, particularly with large language models like ChatGPT, have shown remarkable performance in tasks such as question answering and text generation. However, ChatGPT's potential in spam identification remains underexplored. To fill this gap, this study evaluates ChatGPT's capabilities for spam identification in both English and Chinese email datasets. We employ ChatGPT for spam email detection using in-context learning, which requires a prompt instruction with…
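The in-context learning setup mentioned in the abstract can be illustrated with a minimal sketch. It only shows how an instruction, a variable number of labeled demonstrations, and a query email might be combined into one prompt. The instruction wording, the "spam"/"ham" label names, and the helper names `build_prompt` and `parse_label` are assumptions made for illustration; they are not the prompt or code used in the study, and no model is called here.

```python
from typing import List, Tuple

# A demonstration is an (email text, label) pair; the label names are assumed.
Demo = Tuple[str, str]

# Hypothetical instruction text, not the wording used in the paper.
INSTRUCTION = (
    "Classify the following message as 'spam' or 'ham'. "
    "Answer with a single word."
)


def build_prompt(demos: List[Demo], query: str, k: int) -> str:
    """Join the instruction, the first k demonstrations, and the query email."""
    if k < 0:
        raise ValueError("k must be non-negative")
    parts = [INSTRUCTION]
    for text, label in demos[:k]:
        parts.append(f"Email: {text}\nLabel: {label}")
    # The query is left unlabeled so the model completes the label.
    parts.append(f"Email: {query}\nLabel:")
    return "\n\n".join(parts)


def parse_label(reply: str) -> str:
    """Read the first word of a reply; anything other than 'spam' counts as 'ham'."""
    tokens = reply.strip().lower().split()
    first = tokens[0].strip(".,!'\"") if tokens else ""
    return "spam" if first == "spam" else "ham"
```

Varying `k` while holding the query fixed is one simple way to probe how the number of demonstrations changes the prediction, which is the kind of comparison the findings describe.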
Taxonomy
Topics: Spam and Phishing Detection · Personal Information Management and User Behavior · Online Learning and Analytics
Methods: Attention Is All You Need · Softmax · WordPiece · Residual Connection · Linear Layer · Weight Decay · Dropout · Layer Normalization · Multi-Head Attention
