Phishing Attacks and Websites Classification Using Machine Learning and Multiple Datasets (A Comparative Analysis)

Sohail Ahmed Khan; Wasiq Khan; Abir Hussain

arXiv:2101.02552·cs.CR·January 8, 2021

TL;DR

This paper compares machine learning algorithms for phishing website detection across multiple datasets and finds that random forest and artificial neural network classifiers perform best, exceeding 97% accuracy.

Contribution

It provides a comprehensive analysis of different ML algorithms and feature importance for phishing detection, with a focus on performance across diverse datasets.

Findings

01

Random forest and neural networks outperform other algorithms.

02

Achieved over 97% accuracy in phishing classification.

03

Feature selection improves model performance.

Abstract

Phishing attacks are the most common type of cyber-attack used to obtain sensitive information, and they affect individuals as well as organisations across the globe. Various techniques have been proposed to identify phishing attacks, in particular the deployment of machine intelligence in recent years. However, the algorithms deployed and the discriminating factors used vary widely across existing works. In this study, we present a comprehensive analysis of various machine learning algorithms and evaluate their performance over multiple datasets. We further investigate the most significant features within multiple datasets and compare the classification performance on the reduced-dimensional datasets. The statistical results indicate that random forest and artificial neural network outperform other classification algorithms, achieving over 97% accuracy using the identified features.
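The evaluation protocol described in the abstract (rank features, then compare accuracy on the full and reduced feature sets across several datasets) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data is synthetic rather than any of the paper's datasets, a nearest-centroid classifier stands in for random forest and neural networks to keep the example dependency-free, the class-mean gap is a crude substitute for whatever feature importance measure the authors used, and all function names are hypothetical.

```python
import random
from statistics import mean


def make_dataset(n_samples, n_informative, n_noise, seed):
    """Synthetic binary data (1 = 'phishing', 0 = 'legitimate'); not the paper's data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        # Informative features shift with the label; noise features do not.
        informative = [rng.gauss(1.5 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y


def class_means(X, y):
    """Per-class mean of every feature column."""
    n_cols = len(X[0])
    means = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        means[c] = [mean(row[j] for row in rows) for j in range(n_cols)]
    return means


def rank_features(X, y):
    """Order feature indices by the absolute gap between class means (assumed score)."""
    m = class_means(X, y)
    gaps = [abs(m[1][j] - m[0][j]) for j in range(len(X[0]))]
    return sorted(range(len(gaps)), key=lambda j: gaps[j], reverse=True)


def select(rows, features):
    """Keep only the chosen feature columns."""
    return [[row[j] for j in features] for row in rows]


def predict(means, x):
    """Nearest-centroid prediction (stand-in for the paper's classifiers)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, means[c]))
    return 0 if sq_dist(0) <= sq_dist(1) else 1


def accuracy(X_train, y_train, X_test, y_test, features):
    """Fit on the training split and score on the held-out split."""
    means = class_means(select(X_train, features), y_train)
    X_sel = select(X_test, features)
    hits = sum(predict(means, x) == label for x, label in zip(X_sel, y_test))
    return hits / len(y_test)


def compare(seed, n_keep=3, split=300):
    """Full-feature vs reduced-feature accuracy on one synthetic dataset."""
    X, y = make_dataset(400, n_informative=3, n_noise=7, seed=seed)
    X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]
    top = rank_features(X_tr, y_tr)[:n_keep]  # ranked on training data only
    return {
        "full": accuracy(X_tr, y_tr, X_te, y_te, list(range(len(X[0])))),
        "reduced": accuracy(X_tr, y_tr, X_te, y_te, top),
        "top_features": sorted(top),
    }


results = {f"dataset_{seed}": compare(seed) for seed in (1, 2, 3)}
for name, r in results.items():
    print(name, f"full={r['full']:.2f}", f"reduced={r['reduced']:.2f}", r["top_features"])
```

Features are ranked on the training split only, so the held-out accuracy is not influenced by the selection step. In a real replication the stand-in classifier would be replaced by the algorithms the paper actually compares.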
