Multilingual Email Phishing Attacks Detection using OSINT and Machine Learning

Panharith An; Rana Shafi; Tionge Mughogho; Onyango Allan Onyango

arXiv:2501.08723·cs.CR·January 16, 2025

Open Access

TL;DR

This paper combines OSINT tools (Nmap and theHarvester) with machine learning to detect email phishing across languages. Random Forest reaches 97.37% accuracy on both the English and Arabic datasets.

Contribution

It introduces an integrated approach that feeds OSINT-derived features into ML models for multilingual phishing detection, improving accuracy over baseline models.

Findings

1. Random Forest achieved 97.37% accuracy on English and Arabic datasets.

2. OSINT features improved detection accuracy compared to baseline models.

3. Multilingual datasets enhance the robustness of phishing detection models.

Abstract

Email phishing remains a prevalent cyber threat, targeting victims to extract sensitive information or deploy malicious software. This paper explores the integration of open-source intelligence (OSINT) tools and machine learning (ML) models to enhance phishing detection across multilingual datasets. Using Nmap and theHarvester, this study extracted 17 features, including domain names, IP addresses, and open ports, to improve detection accuracy. Multilingual email datasets, including English and Arabic, were analyzed to address the limitations of ML models trained predominantly on English data. Experiments with five classification algorithms (Decision Tree, Random Forest, Support Vector Machine, XGBoost, and Multinomial Naïve Bayes) revealed that Random Forest achieved the highest performance, with an accuracy of 97.37% for both English and Arabic datasets. For OSINT-enhanced…
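The abstract names the kinds of OSINT signals used (domain names, IP addresses, open ports) but this listing does not enumerate the 17 features. As a rough illustration of how such signals might be turned into a numeric vector for a classifier, here is a minimal Python sketch. The `OsintRecord` class, the `to_features` function, and the six features it computes are all hypothetical and are not taken from the paper. The record is assumed to have been filled in beforehand from tool output such as Nmap or theHarvester results.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class OsintRecord:
    """Hypothetical container for OSINT results about one sender domain."""
    domain: str
    ip: str
    open_ports: list = field(default_factory=list)


def to_features(rec: OsintRecord) -> list:
    """Turn one OSINT record into a fixed-length numeric vector.

    The six features below are illustrative assumptions only; they are
    not the 17 features used in the paper.
    """
    labels = rec.domain.lower().strip(".").split(".")
    addr = ipaddress.ip_address(rec.ip)
    ports = set(rec.open_ports)
    return [
        float(len(rec.domain)),                          # domain length in characters
        float(len(labels)),                              # number of DNS labels
        float(any(ch.isdigit() for ch in rec.domain)),   # 1.0 if the domain contains a digit
        float(addr.is_private),                          # 1.0 if the IP is in a private range
        float(len(ports)),                               # count of distinct open ports
        float(bool(ports & {25, 465, 587})),             # 1.0 if a common SMTP port is open
    ]


# Example record with made-up values for illustration.
example = OsintRecord("mail.example.com", "93.184.216.34", [25, 80, 443])
vector = to_features(example)
```

A vector like this could be appended to text-derived features before training any of the five classifiers the abstract lists. Which features actually help is an empirical question that the sketch does not answer.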

Taxonomy

Topics: Spam and Phishing Detection