Correlated Feature Selection for Tweet Spam Classification

Prakamya Mishra

arXiv:1911.05495 · cs.SI · October 27, 2020

TL;DR

This paper presents a correlated feature selection method combined with neural network classification to effectively identify spam tweets on Twitter, achieving high accuracy and reducing training time.

Contribution

It introduces a correlation-based feature selection approach for spam detection and shows that a neural network classifier outperforms traditional classifiers.

Findings

01. Neural networks achieved 97.57% accuracy in spam classification.

02. Correlated feature reduction improved training efficiency.

03. Neural network outperformed SVM, KNN, Naive Bayes, and Random Forest.

Abstract

Identifying spam messages on social networks is a challenging task. Social media sites like Twitter and Facebook attract many users, as well as companies that advertise to those users for their own gain. These advertisements often lead to spamming, which in turn leads to a poor user experience. The purpose of this paper is to analyze spamming on Twitter. To classify spam efficiently, it is necessary to first understand the features of spam tweets and to identify the attributes of spammers. We extract both tweet-based and user-based features for our analysis and observe the correlation between these features. This step is necessary because combining highly correlated features reduces the training time. Our proposed approach uses a classification model based on artificial neural networks to classify the tweets as spam or…
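The correlation step described in the abstract can be illustrated with a small sketch. The abstract does not say which correlation measure, threshold, or merging rule the paper uses, so everything here is an assumption: Pearson correlation, a cutoff of 0.9, and a greedy rule that drops one feature from each highly correlated pair rather than merging the two. The feature names and values are hypothetical toy data, not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def select_uncorrelated(features, threshold=0.9):
    """Greedily keep a feature only if its |r| with every
    already-kept feature is below the (assumed) threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical user-based and tweet-based features for five accounts.
features = {
    "followers": [10, 200, 35, 5000, 80],
    "following": [12, 190, 40, 4800, 90],  # nearly duplicates "followers"
    "url_count": [0, 3, 1, 0, 2],
    "hashtag_count": [1, 0, 4, 2, 0],
}

selected = select_uncorrelated(features)
print(selected)
```

With this toy data, "following" is dropped because it tracks "followers" almost exactly, so a downstream classifier would train on three inputs instead of four. Fewer inputs means fewer weights in the first layer of a neural network, which is one plausible reading of how the reduction shortens training time.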

Taxonomy

Topics: Spam and Phishing Detection · Network Security and Intrusion Detection · Sentiment Analysis and Opinion Mining

Methods: Support Vector Machine