Entropy-based Classification of 'Retweeting' Activity on Twitter

Rumi Ghosh; Tawan Surachawala; Kristina Lerman

arXiv:1106.0346 · cs.SI · June 3, 2011 · 61 cites

Open Access

TL;DR

This paper introduces an entropy-based method to classify Twitter retweeting activities, effectively distinguishing between various user behaviors such as spam, news sharing, and promotional campaigns using only two features.

Contribution

It presents a novel, scalable, and robust information-theoretic approach for classifying Twitter retweeting activities based on time-interval and user entropy features.

Findings

01 Successfully categorized five distinct retweeting activities

02 Achieved high accuracy in activity separation using minimal features

03 Demonstrated method's robustness to sampling and missing data

Abstract

Twitter is used for a variety of reasons, including information dissemination, marketing, political organizing, propaganda, spamming, promotion, conversations, and so on. Characterizing these activities and categorizing the associated user-generated content is a challenging task. We present an information-theoretic approach to classification of user activity on Twitter. We focus on tweets that contain embedded URLs and study their collective 'retweeting' dynamics. We identify two features, time-interval and user entropy, which we use to classify retweeting activity. We achieve good separation of different activities using just these two features and are able to categorize content based on the collective user response it generates. We have identified five distinct categories of retweeting activity on Twitter: automatic/robotic activity, newsworthy information dissemination,…
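
The two features named in the abstract can be illustrated with a minimal Python sketch. It assumes, since this page does not give the definitions, that each feature is the Shannon entropy of an empirical distribution: time-interval entropy over the gaps between consecutive retweets of a URL, and user entropy over how the retweets are spread across users. The function names, the one-second granularity, and the sample data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over the distinct observed values
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def retweet_entropies(retweets):
    """Return (time_interval_entropy, user_entropy) for one URL.

    `retweets` is a list of (timestamp_in_seconds, user_id) pairs.
    Assumption: intervals are measured in whole seconds between
    consecutive retweets, and each distinct interval is one outcome.
    """
    ordered = sorted(retweets, key=lambda r: r[0])
    times = [t for t, _ in ordered]
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    users = [u for _, u in ordered]
    return shannon_entropy(intervals), shannon_entropy(users)


# Hypothetical data: one account reposting every 60 seconds...
robotic = [(0, "bot"), (60, "bot"), (120, "bot"), (180, "bot")]
# ...versus four different users retweeting at irregular times.
organic = [(0, "a"), (5, "b"), (47, "c"), (300, "d")]

print(retweet_entropies(robotic))
print(retweet_entropies(organic))
```

Under these assumptions, perfectly regular retweets from a single account give zero for both entropies, while retweets from many users at irregular intervals give higher values for both, which is the kind of separation the abstract describes.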

Taxonomy

Topics: Spam and Phishing Detection · Misinformation and Its Impacts · Complex Network Analysis Techniques