Towards Text-based Phishing Detection
Gilchan Park, Julia M. Taylor

TL;DR
This study improves recognition of phishing emails over previously published non-semantic work by using a modified algorithm and readily available resources, at the cost of a slightly higher false positive rate. Adding semantic analysis is expected to reduce false positives.
Contribution
It presents a modified version of a previously published phishing detection algorithm that, using the same readily available tools and no semantics, recognizes phishing emails considerably better than the earlier work.
Findings
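Considerably better recognition of phishing emails than the previously reported work
Slightly higher rate of text falsely identified as phishing (false positives)
Adding a semantic component is expected to reduce false positives while preserving detection accuracy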
Abstract
This paper reports on an experiment in text-based phishing detection that uses readily available resources and no semantics. The developed algorithm is a modified version of previously published work and relies on the same tools. The results obtained in recognizing phishing emails are considerably better than those previously reported, but the rate of text falsely identified as phishing is slightly worse. It is expected that adding a semantic component will reduce the false positive rate while preserving the detection accuracy.
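The abstract weighs two quantities against each other: how many phishing emails are recognized and how much legitimate text is falsely flagged. The sketch below shows one common way to compute both from labeled predictions. The function name `phishing_rates`, the `Rates` container, and the toy labels are hypothetical illustrations, not the paper's algorithm or data.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    detection_rate: float        # share of phishing emails correctly flagged
    false_positive_rate: float   # share of legitimate emails wrongly flagged


def phishing_rates(true_labels, predicted_labels):
    """Compute detection and false positive rates.

    Labels are booleans: True = phishing, False = legitimate.
    This is a generic evaluation helper (an assumption for illustration),
    not the detection method described in the paper.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    tp = fp = fn = tn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth and pred:
            tp += 1      # phishing, flagged
        elif truth and not pred:
            fn += 1      # phishing, missed
        elif not truth and pred:
            fp += 1      # legitimate, wrongly flagged
        else:
            tn += 1      # legitimate, passed

    detection = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive = fp / (fp + tn) if (fp + tn) else 0.0
    return Rates(detection, false_positive)


# Toy data (made up): 4 phishing emails followed by 6 legitimate ones.
truth = [True, True, True, True, False, False, False, False, False, False]
pred = [True, True, True, False, True, False, False, False, False, False]
rates = phishing_rates(truth, pred)
print(rates)
```

On this toy input, three of four phishing emails are caught and one of six legitimate emails is wrongly flagged. A detector can raise the first rate while also raising the second, which is the trade-off the abstract describes.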
Taxonomy
Topics: Spam and Phishing Detection · Text and Document Classification Technologies · Misinformation and Its Impacts
