An intelligent classification model for phishing email detection

Adwan Yasin; Abdelmunem Abuhasan

arXiv:1608.02196·cs.CR·August 9, 2016·1 cites

Open Access

TL;DR

This paper introduces a machine-learning model for detecting phishing emails that combines text processing, phishing term weighting, and knowledge discovery techniques. Its best classifier, Random Forest, reaches 0.991 accuracy, which the authors report exceeds similar existing techniques.

Contribution

It proposes a phishing detection model that incorporates phishing term weighting and text enrichment with WordNet synonyms, and evaluates five classification algorithms, the best of which reaches 0.991 accuracy.

Findings

01 Random Forest achieved 0.991 accuracy

02 J48 classifier achieved 0.984 accuracy

03 Model outperforms similar existing techniques

Abstract

Phishing is a prevalent form of cyber attack in which attackers send socially engineered messages designed to trick users into revealing sensitive information; email is the most popular channel for these messages. This paper presents an intelligent classification model for detecting phishing emails using knowledge discovery, data mining and text processing techniques. It introduces the concept of phishing term weighting, which evaluates the weight of phishing terms in each email. The preprocessing phase is enhanced by applying text stemming and the WordNet ontology to enrich the model with word synonyms. The model applies the knowledge discovery procedure using five popular classification algorithms and achieves a notable improvement in classification accuracy: 0.991 accuracy was achieved using the…
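The abstract does not give the weighting formula, so the following is only a minimal sketch of what a phishing term weight could look like, assuming the weight is the share of an email's stemmed tokens that match a phishing term list. The names `PHISHING_TERMS`, `crude_stem`, and `phishing_term_weight` are hypothetical, the term list is invented for illustration, and the suffix stripper stands in for a real stemmer; WordNet synonym enrichment is left out.

```python
import re

# Hypothetical term list for illustration only; not taken from the paper.
PHISHING_TERMS = {"verify", "account", "password", "suspend", "urgent", "click"}

# Suffixes removed by the toy stemmer below (an assumption, not the paper's stemmer).
SUFFIXES = ("ing", "ed", "s")


def crude_stem(word):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def phishing_term_weight(text):
    """Return the fraction of stemmed tokens that match a phishing term.

    Assumed definition for illustration; the paper may weight terms differently.
    """
    stems = [crude_stem(token) for token in tokenize(text)]
    if not stems:
        return 0.0
    term_stems = {crude_stem(term) for term in PHISHING_TERMS}
    hits = sum(1 for stem in stems if stem in term_stems)
    return hits / len(stems)
```

Under these assumptions, `phishing_term_weight("Urgent: verify your account password")` counts four matching tokens out of five. A score like this could serve as one input feature to classifiers such as Random Forest or J48, alongside whatever other features the model uses.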

Taxonomy

Topics: Spam and Phishing Detection · Misinformation and Its Impacts · User Authentication and Security Systems