Explainable Transformer-Based Email Phishing Classification with Adversarial Robustness

Published: November 15, 2025 | arXiv ID: 2511.12085v1

By: Sajad U P

Potential Business Impact:

Catches tricky fake emails, even AI ones.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Phishing and related cyber threats are becoming more varied and technologically advanced. Among these, email-based phishing remains the most dominant and persistent threat. These attacks exploit human vulnerabilities to disseminate malware or gain unauthorized access to sensitive information. Deep learning (DL) models, particularly transformer-based models, have significantly enhanced phishing mitigation through their contextual understanding of language. However, some recent threats, specifically Artificial Intelligence (AI)-generated phishing attacks, are reducing the resilience of phishing detectors. In response, adversarial training has shown promise against AI-generated phishing threats. This study presents a hybrid approach that uses DistilBERT, a smaller, faster, and lighter version of the BERT transformer model, for email classification. Robustness against text-based adversarial perturbations is reinforced using Fast Gradient Method (FGM) adversarial training. Furthermore, the framework integrates the LIME Explainable AI (XAI) technique to make the DistilBERT classifier's predictions more transparent. The framework also uses the Flan-T5-small language model from Hugging Face to generate plain-language security narratives that explain each decision to end-users. This combined approach aims for precise phishing classification while providing easily understandable justifications for the model's decisions.
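To make the FGM adversarial training step in the abstract concrete, here is a minimal sketch of the general technique, not the paper's implementation. It assumes a toy logistic-regression classifier over a fixed-length feature vector as a stand-in for DistilBERT's embedding layer. The function names, the learning rate, and the epsilon value are all hypothetical choices made for illustration. The core idea is that the input representation is shifted by epsilon along the L2-normalized loss gradient, and the model is then updated on both the clean and the perturbed example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grads(w, b, x, y):
    """Binary cross-entropy for one example, with gradients
    with respect to the weights, the bias, and the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    err = p - y  # derivative of the loss with respect to the logit
    grad_w = [err * xi for xi in x]
    grad_b = err
    grad_x = [err * wi for wi in w]
    return loss, grad_w, grad_b, grad_x


def fgm_perturb(x, grad_x, epsilon):
    """FGM perturbation: move x by epsilon along the L2-normalized gradient."""
    norm = math.sqrt(sum(g * g for g in grad_x))
    if norm == 0.0:
        return list(x)  # no gradient signal, leave the input unchanged
    return [xi + epsilon * g / norm for xi, g in zip(x, grad_x)]


def fgm_train_step(w, b, x, y, lr=0.1, epsilon=0.5):
    """One gradient step on the sum of the clean and adversarial losses.
    lr and epsilon are illustrative values, not taken from the paper."""
    _, gw, gb, gx = loss_and_grads(w, b, x, y)
    x_adv = fgm_perturb(x, gx, epsilon)
    _, gw_adv, gb_adv, _ = loss_and_grads(w, b, x_adv, y)
    new_w = [wi - lr * (g + ga) for wi, g, ga in zip(w, gw, gw_adv)]
    new_b = b - lr * (gb + gb_adv)
    return new_w, new_b


# Example: one phishing-labelled (y = 1) feature vector.
w, b = [0.5, -0.3], 0.0
x, y = [1.0, 2.0], 1
clean_loss, _, _, gx = loss_and_grads(w, b, x, y)
x_adv = fgm_perturb(x, gx, epsilon=0.5)
adv_loss, _, _, _ = loss_and_grads(w, b, x_adv, y)
w_new, b_new = fgm_train_step(w, b, x, y)
```

In a transformer setting the same perturbation would typically be applied to the token embedding matrix rather than to a raw feature vector, with the gradients obtained by backpropagation instead of the closed-form expressions used here.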

Robust ML-based Detection of Conventional, LLM-Generated, and Adversarial Phishing Emails Using Advanced Text Preprocessing

Cryptography and Security

Stops fake emails from tricking you.

13 Oct 2025

Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability

Cryptography and Security

Helps computers spot fake emails better.

16 Jun 2025

Deep Reinforcement Learning for Phishing Detection with Transformer-Based Semantic Features

Machine Learning (CS)

Stops fake websites from stealing your money.

7 Dec 2025

Page Count

9 pages
