Explainable Transformer-Based Email Phishing Classification with Adversarial Robustness

Sajad U P

arXiv:2511.12085 · cs.CR · February 9, 2026

Open Access

TL;DR

This paper introduces a hybrid transformer-based framework for email phishing detection that pairs adversarial training, for robustness against AI-generated threats, with explainability techniques for transparency.

Contribution

It proposes a novel approach that integrates DistilBERT with adversarial training and LIME, an explainable AI (XAI) technique, and uses Flan-T5 to generate user-friendly explanations, improving the robustness and interpretability of phishing detection.
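
As a rough illustration of the adversarial-training idea, the sketch below augments a labeled email set with perturbed copies that keep their original labels. The summary above does not say which attacks the paper uses, so the character substitutions in `HOMOGLYPHS`, the `perturb` and `augment` helpers, and the toy emails are all hypothetical; a real pipeline would feed the augmented set to a DistilBERT fine-tuning loop, which is omitted here.

```python
import random

# Hypothetical look-alike substitutions (an assumption, not the paper's attack set).
HOMOGLYPHS = {"a": "@", "e": "3", "i": "1", "o": "0", "s": "$"}


def perturb(text: str, rate: float = 0.3, seed: int = 0) -> str:
    """Replace a random fraction of substitutable characters with look-alikes."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    out = []
    for ch in text:
        sub = HOMOGLYPHS.get(ch.lower())
        if sub is not None and rng.random() < rate:
            out.append(sub)
        else:
            out.append(ch)
    return "".join(out)


def augment(dataset, rate: float = 0.3, seed: int = 0):
    """Return the original (text, label) pairs plus one perturbed copy of each.

    The label is left unchanged, so a model trained on the result sees both
    the clean and the obfuscated form of every example.
    """
    augmented = list(dataset)
    for i, (text, label) in enumerate(dataset):
        augmented.append((perturb(text, rate, seed + i), label))
    return augmented


# Toy data: 1 = phishing, 0 = legitimate (labels are illustrative only).
emails = [
    ("Verify your account password now", 1),
    ("Minutes from the Tuesday meeting", 0),
]
train_set = augment(emails)
```

Keeping the clean examples alongside the perturbed ones is a common design choice: the classifier is pushed to assign the same label to both forms rather than to trade clean accuracy for robustness.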

Findings

01. Improved resilience of phishing detection against adversarial attacks.

02. Enhanced transparency through LIME explanations.

03. Effective generation of plain-language security narratives.
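
To make the explanation findings more concrete, here is a minimal sketch of word-level attribution followed by a plain-language summary. It is not LIME and not Flan-T5: it swaps in a simpler leave-one-word-out (occlusion) score in place of LIME's local surrogate model, and a fixed sentence template in place of a generated narrative. The keyword weights, `toy_phishing_score`, `occlusion_attribution`, and `narrate` are hypothetical stand-ins for a trained classifier and the paper's explanation pipeline.

```python
from typing import Callable, List, Tuple

# Hypothetical keyword weights standing in for a trained model (an assumption).
SUSPICIOUS = {"verify": 0.3, "password": 0.3, "urgent": 0.2, "account": 0.1}


def toy_phishing_score(text: str) -> float:
    """Toy stand-in for a classifier's phishing probability, capped at 1.0."""
    words = set(text.lower().split())
    return min(1.0, sum(w for k, w in SUSPICIOUS.items() if k in words))


def occlusion_attribution(
    text: str, score_fn: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Rank words by how much the score drops when each one is removed."""
    words = text.split()
    base = score_fn(text)
    result = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        result.append((word, base - score_fn(reduced)))
    # Largest drop first; ties keep their original word order (stable sort).
    return sorted(result, key=lambda pair: pair[1], reverse=True)


def narrate(text: str, score_fn: Callable[[str], float], top_k: int = 2) -> str:
    """Turn the top-ranked words into a one-sentence, template-based summary."""
    ranked = occlusion_attribution(text, score_fn)
    top = [word for word, drop in ranked[:top_k] if drop > 0]
    if not top:
        return "No single word stood out as a phishing signal."
    return "Flagged mainly because of the words: " + ", ".join(top) + "."


message = "Urgent verify your password today"
print(occlusion_attribution(message, toy_phishing_score))
print(narrate(message, toy_phishing_score))
```

Occlusion needs only one extra model call per word, which makes it easy to follow, but it ignores interactions between words; LIME instead samples many perturbed versions of the input and fits a weighted linear model to the classifier's outputs.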

Abstract

Phishing and related cyber threats are becoming more varied and technologically advanced. Among these, email-based phishing remains the most dominant and persistent threat. These attacks exploit human vulnerabilities to disseminate malware or gain unauthorized access to sensitive information. Deep learning (DL) models, particularly transformer-based models, have significantly enhanced phishing mitigation through their contextual understanding of language. However, some recent threats, specifically Artificial Intelligence (AI)-generated phishing attacks, are reducing the overall system resilience of phishing detectors. In response, adversarial training has shown promise against AI-generated phishing threats. This study presents a hybrid approach that uses DistilBERT, a smaller, faster, and lighter version of the BERT transformer model for email classification. Robustness against…

Taxonomy

Topics: Spam and Phishing Detection · Misinformation and Its Impacts · Adversarial Robustness in Machine Learning