SpaML: a Bimodal Ensemble Learning Spam Detector based on NLP Techniques

Jaouhar Fattahi; Mohamed Mejri

arXiv:2010.07444·cs.CR·January 1, 2021

TL;DR

SpaML is an ensemble learning spam detection tool that combines two NLP techniques, BoW and TF-IDF, with multiple classifiers to improve accuracy and precision.

Contribution

The paper introduces SpaML, combining supervised and unsupervised classifiers with NLP techniques in an ensemble framework for enhanced spam detection.
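As a rough illustration of what combining classifiers in an ensemble can look like, the sketch below uses plain majority voting. This is an assumption made for illustration only: the summary here does not state which combination strategy SpaML uses, and the three rule-based functions (`keyword_rule`, `exclamation_rule`, `uppercase_rule`) are hypothetical stand-ins for the paper's trained supervised and unsupervised classifiers.

```python
from collections import Counter


def keyword_rule(message):
    # Hypothetical stand-in classifier: flags messages containing bait words.
    text = message.lower()
    return "spam" if ("free" in text or "prize" in text) else "ham"


def exclamation_rule(message):
    # Hypothetical stand-in classifier: flags messages with an exclamation mark.
    return "spam" if "!" in message else "ham"


def uppercase_rule(message):
    # Hypothetical stand-in classifier: flags messages with a fully uppercase word.
    return "spam" if any(w.isupper() and len(w) > 1 for w in message.split()) else "ham"


def majority_vote(classifiers, message):
    # Each classifier casts one vote; the most common label wins.
    # With an odd number of binary classifiers there can be no tie.
    votes = Counter(clf(message) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [keyword_rule, exclamation_rule, uppercase_rule]
for msg in ["WIN a free prize!", "Lunch is free today", "See you at noon"]:
    print(msg, "->", majority_vote(ensemble, msg))
```

In this toy setup, "Lunch is free today" is flagged by one rule but outvoted by the other two, which is the basic intuition for why an ensemble can be more robust than any single member.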

Findings

01

SpaML achieves high accuracy in spam detection.

02

The ensemble approach outperforms individual classifiers.

03

NLP techniques significantly improve detection performance.

Abstract

In this paper, we present SpaML, a new spam detection tool that uses a set of supervised and unsupervised classifiers together with two Natural Language Processing (NLP) techniques, namely Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF). We first present the NLP techniques used. We then present our classifiers and their performance with each of these techniques. Next, we present our overall Ensemble Learning classifier and the strategy we use to combine the individual classifiers. Finally, we present the results obtained by SpaML in terms of accuracy and precision.
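The two text representations named in the abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the whitespace tokenizer, the sample messages, and the unsmoothed weighting tf × log(N / df) are all assumptions, and libraries often use smoothed or normalized variants.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; a real pipeline would
    # typically also strip punctuation and stop words.
    return text.lower().split()


def bag_of_words(docs):
    # Build a sorted vocabulary, then one raw term-count vector per document.
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    # Weight each count by how rare the term is across the collection:
    # weight = (count / document length) * log(N / document frequency).
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weights = []
    for row in counts:
        total = sum(row)
        weights.append([(c / total) * w if total else 0.0 for c, w in zip(row, idf)])
    return vocab, weights


# Hypothetical sample messages, for illustration only.
messages = ["win a free prize now", "free free offer", "meeting at noon"]
vocab, bow = bag_of_words(messages)
_, weights = tf_idf(messages)
print(vocab)
print(bow[1])
print(weights[1])
```

BoW keeps only raw counts, so a frequent term such as "free" dominates the vector. TF-IDF scales each count down when the term appears in many documents, so terms unique to a single message receive relatively more weight.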
