Explainable phishing website detection for secure and sustainable cyber infrastructure
Tanzila Kehkashan, Maha Abdelhaq, Ahmad Sami Al-Shamayleh, Nazish Huda, Imran Ashraf Yaseen, Abdelmuttlib Ibrahim Abdalla Ahmed, Adnan Akhunzada

TL;DR
This paper proposes an explainable phishing detection system using machine learning and SHAP to improve accuracy and interpretability for secure cyber infrastructure.
Contribution
The novelty lies in using SHAP-based feature selection with URL-based models for interpretable and accurate phishing detection.
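To make "URL-based" concrete, the following is a minimal sketch of lexical feature extraction from a URL string alone, with no page content or external lookups. The function name `url_features` and the specific features are illustrative assumptions; the paper's actual feature set is not listed in this summary.

```python
import re
from urllib.parse import urlsplit


def url_features(url: str) -> dict:
    """Return a few lexical features computed from the URL string only.

    Hypothetical feature set for illustration; not the paper's own list.
    """
    # urlsplit needs a scheme to locate the host, so add one if missing.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        # Host written as a dotted IPv4 address instead of a domain name.
        "has_ip_host": int(bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))),
        "uses_https": int(parts.scheme == "https"),
        # Number of non-empty path segments.
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
    }


if __name__ == "__main__":
    print(url_features("http://192.168.0.1/login/verify"))
```

Because every feature is derived from the string itself, extraction needs no network access, which fits the summary's claim that the approach suits resource-constrained devices.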
Findings
The random forest model achieved 97% accuracy in phishing detection.
SHAP improved model interpretability by highlighting important URL-based features.
The proposed system is efficient and suitable for resource-constrained devices.
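SHAP attributes a prediction to individual features using Shapley values. The sketch below computes exact Shapley values by brute-force subset enumeration for a tiny hand-written scoring function. It is not the `shap` library and not the paper's random forest: `toy_score`, its weights, and the all-zero baseline are assumptions chosen only to show how the attribution, and a ranking derived from it, can be obtained.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_score(f):
    # Hypothetical phishing score with one interaction term (assumed weights).
    return (0.5 * f["has_ip_host"] + 0.3 * f["has_at_symbol"]
            + 0.02 * f["num_dots"]
            + 0.2 * f["has_ip_host"] * f["has_at_symbol"])


x = {"has_ip_host": 1, "has_at_symbol": 1, "num_dots": 3}
baseline = {"has_ip_host": 0, "has_at_symbol": 0, "num_dots": 0}
phi = shapley_values(toy_score, x, baseline)

# Rank features by absolute attribution; keeping the top k is one simple
# way to turn attributions into a feature-selection step.
ranking = sorted(phi, key=lambda k: abs(phi[k]), reverse=True)

if __name__ == "__main__":
    print(phi)
    print(ranking)
```

The attributions sum to the difference between the score at `x` and the score at the baseline, and the interaction term is split evenly between the two features that share it.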
Abstract
Phishing is a social engineering attack and a form of cybercrime that continues to rise at an alarming rate. Phishing attacks can affect many sectors, including the governmental, social, and financial sectors and individual businesses. Traditional methods of identifying phishing websites, such as blacklist and heuristic approaches, often fail to provide sufficient protection. Moreover, techniques that combine URLs, webpage content, and external features are time-consuming, require substantial computing power, and are unsuitable for devices with limited resources. Previous research has also often overlooked which features are important for detection and how they affect outcomes, so the significance of individual features may not be fully captured. To overcome this issue, this research applies feature selection techniques,…
Taxonomy
Topics: Spam and Phishing Detection · Cybercrime and Law Enforcement Studies · Misinformation and Its Impacts
