Making the Most of Tweet-Inherent Features for Social Spam Detection on Twitter

Bo Wang; Arkaitz Zubiaga; Maria Liakata; Rob Procter

arXiv:1503.07405·cs.IR·March 26, 2015·82 cites

TL;DR

This paper presents a method for detecting social spam on Twitter using only tweet-inherent features, enabling faster and more scalable spam detection suitable for real-time applications.

Contribution

It introduces a novel approach focusing solely on tweet-inherent features for spam detection, demonstrating competitive results without relying on extensive user data.

Findings

1. Achieved high detection accuracy with limited tweet features
2. Identified effective classifiers and feature sets for social spam detection
3. Demonstrated generalizability across different datasets

Abstract

Social spam produces a great amount of noise on social media services such as Twitter, which reduces the signal-to-noise ratio that both end users and data mining applications observe. Existing techniques on social spam detection have focused primarily on the identification of spam accounts by using extensive historical and network-based data. In this paper we focus on the detection of spam tweets, which optimises the amount of data that needs to be gathered by relying only on tweet-inherent features. This enables the application of the spam detection system to a large set of tweets in a timely fashion, potentially applicable in a real-time or near real-time setting. Using two large hand-labelled datasets of tweets containing spam, we study the suitability of five classification algorithms and four different feature sets to the social spam detection task. Our results show that, by using…
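
To make the idea of "tweet-inherent features" concrete, the following is a minimal sketch of computing features from the text of a single tweet, with no account history or network data. The function name `tweet_features` and the specific features chosen here are illustrative assumptions; they are not the four feature sets evaluated in the paper, which the abstract does not enumerate.

```python
import re

# Simple patterns for entities that appear in the tweet text itself.
# These are illustrative assumptions, not the paper's feature definitions.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def tweet_features(text: str) -> dict:
    """Compute a few hypothetical features from the tweet text alone."""
    tokens = text.split()
    letters = [c for c in text if c.isalpha()]
    upper = sum(c.isupper() for c in letters)
    return {
        "n_chars": len(text),
        "n_tokens": len(tokens),
        "n_urls": len(URL_RE.findall(text)),
        "n_mentions": len(MENTION_RE.findall(text)),
        "n_hashtags": len(HASHTAG_RE.findall(text)),
        "n_digits": sum(c.isdigit() for c in text),
        # Share of alphabetic characters that are upper case (0.0 if none).
        "upper_ratio": upper / len(letters) if letters else 0.0,
    }


example = tweet_features(
    "WIN a FREE phone!!! http://example.com #free #win @someone"
)
```

Because every value is derived from one tweet, a feature dictionary like this can be computed as the tweet arrives, which is the property the abstract highlights for real-time or near real-time use. The dictionaries could then be vectorised and passed to whichever classifier is being compared.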

Taxonomy

Topics: Spam and Phishing Detection · Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies