# Meta-learner-based frameworks for interpretable email spam detection

**Authors:** Meghana Kshirsagar, Vedant Rathi, Conor Ryan

PMC · DOI: 10.3389/frai.2025.1569804 · Frontiers in Artificial Intelligence · 2025-10-21

## TL;DR

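This paper introduces a meta-learner for spam email detection that reaches 0.9905 accuracy and 0.9991 AUC on a hybrid Enron-Spam and TREC 2007 dataset, and retains 0.8970 spam sensitivity (0.7605 AUC) in zero-shot testing on an unseen real-world dataset.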

## Contribution

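A novel meta-learner framework for spam detection that combines the predictions of individual machine-learning and deep-learning classifiers and, according to the authors, surpasses those models in accuracy and generalization.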

## Key findings

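- The meta-learner achieved 0.9905 accuracy and 0.9991 AUC on a hybrid dataset that combines Enron-Spam and TREC 2007.
- It outperformed the only other meta-learning-based spam detection model the authors found in recent literature, with higher accuracy, better generalization from a significantly larger dataset, and lower computational complexity.
- In zero-shot testing on an unseen real-world dataset, it achieved 0.8970 spam sensitivity and 0.7605 AUC.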
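
The three metrics reported above (accuracy, spam sensitivity, and AUC) can be computed as in the minimal sketch below. The labels and scores are made-up toy values, not data from the paper, and the function names and the 0.5 decision threshold are assumptions chosen for illustration. Sensitivity here means recall on the spam class, and AUC is computed as the fraction of spam/ham pairs in which the spam email receives the higher score.

```python
def accuracy(y_true, y_pred):
    """Fraction of emails whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred, positive=1):
    """Recall on the positive (spam) class: TP / (TP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)


def roc_auc(y_true, scores, positive=1):
    """AUC as the share of (spam, ham) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = spam, 0 = ham.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]      # model's spam probabilities
y_pred = [int(s >= 0.5) for s in scores]     # assumed threshold of 0.5

print(round(accuracy(y_true, y_pred), 3))     # 0.667
print(round(sensitivity(y_true, y_pred), 3))  # 0.667
print(round(roc_auc(y_true, scores), 3))      # 0.889
```

Accuracy and sensitivity depend on the chosen threshold, while AUC depends only on how the scores rank spam against ham. That is why a model can show high sensitivity and a more modest AUC at the same time, as in the zero-shot result.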

## Abstract

With the increasing reliance on digital communication, email has become an essential tool for personal and professional correspondence. However, despite its numerous benefits, digital communication faces significant challenges, particularly the prevalence of spam emails. Effective spam email classification systems are crucial to mitigate these issues by automatically identifying and filtering out unwanted messages, enhancing the efficiency of email communication.

We compare five traditional machine-learning and five deep-learning spam classifiers against a novel meta-learner, evaluating how different word embeddings, vectorization schemes, and model architectures affect performance on the Enron-Spam and TREC 2007 datasets. The primary aim is to show how the meta-learner's combined predictions stack up against individual ML and DL approaches.

Our meta-learner outperforms all state-of-the-art models, achieving an accuracy of 0.9905 and an AUC score of 0.9991 on a hybrid dataset that combines Enron-Spam and TREC 2007. To the best of our knowledge, our model also surpasses the only other meta-learning-based spam detection model reported in recent literature, with higher accuracy, better generalization from a significantly larger dataset, and lower computational complexity. We also evaluated our meta-learner in a zero-shot setting on an unseen real-world dataset, achieving a spam sensitivity rate of 0.8970 and an AUC score of 0.7605.

These results demonstrate that meta-learning can yield more robust, bias-resistant spam filters suited for real-world deployment. By combining complementary model strengths, the meta-learner also offers improved resilience against evolving spam tactics.
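
The idea of a meta-learner that combines the predictions of several base models can be illustrated with a small stacking sketch. This is not the authors' architecture, which this summary does not describe. The two base scorers, the keyword list, the toy emails, and the logistic-regression meta-learner are all hypothetical stand-ins chosen so the example runs with only the standard library.

```python
import math

SPAM_WORDS = {"free", "winner", "prize", "click"}  # hypothetical keyword list


def keyword_score(text):
    """Hypothetical base model 1: fraction of tokens that are spam keywords."""
    tokens = text.lower().split()
    return sum(t in SPAM_WORDS for t in tokens) / max(len(tokens), 1)


def shouting_score(text):
    """Hypothetical base model 2: fraction of letters written in upper case."""
    letters = [c for c in text if c.isalpha()]
    return sum(c.isupper() for c in letters) / max(len(letters), 1)


BASE_MODELS = [keyword_score, shouting_score]


def meta_features(text):
    """Stack the base-model outputs into one feature vector per email."""
    return [model(text) for model in BASE_MODELS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(X, y, lr=0.5, epochs=2000):
    """Logistic regression over base-model outputs, trained with plain SGD."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_proba(text, w, b):
    """Meta-learner's spam probability for a single email."""
    x = meta_features(text)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy training set (made up for illustration): 1 = spam, 0 = ham.
TRAIN = [
    ("FREE PRIZE click now", 1),
    ("WINNER WINNER claim your free prize", 1),
    ("Meeting moved to Monday at noon", 0),
    ("Please review the attached report", 0),
]
X = [meta_features(text) for text, _ in TRAIN]
y = [label for _, label in TRAIN]
w, b = fit_meta_learner(X, y)

for text, label in TRAIN:
    print(label, round(predict_proba(text, w, b), 3), text)
```

The base scorers here are fixed heuristics, so they can be applied directly to the training emails. When the base models are themselves trained, stacking normally builds the meta-features from out-of-fold predictions so that the meta-learner does not learn from outputs the base models produced on their own training data.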

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12583181/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12583181/full.md

## References

94 references — full list in the complete paper: https://tomesphere.com/paper/PMC12583181/full.md

---
Source: https://tomesphere.com/paper/PMC12583181