Rumor Detection and Classification for Twitter Data

Sardar Hamidian; Mona T Diab

arXiv:1912.08926 · cs.SI · December 20, 2019 · 65 cites

Open Access

TL;DR

This paper presents a two-step approach to detecting and classifying rumors on Twitter. Using novel features and several levels of preprocessing, it reports an F-measure above 0.82 on a mixed-rumors data set and 84 percent on a single-rumor data set.

Contribution

It introduces a methodology for rumor detection and classification on Twitter that combines novel features, varying levels of preprocessing, and feature grouping, and evaluates it on a standard data set.

Findings

01 F-measure over 0.82 on the mixed-rumors data set

02 84% accuracy on the single-rumor data set

03 Effective feature grouping improves classification performance
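
For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how an F-measure is computed from gold and predicted labels. The function name, the label strings, and the choice of beta = 1 are assumptions made for illustration; this page does not say which averaging scheme or beta value the paper uses.

```python
def f_measure(true_labels, predicted_labels, positive="rumor", beta=1.0):
    """Return (precision, recall, F-beta) for a single positive class.

    The label value "rumor" and the default beta are illustrative
    assumptions, not settings taken from the paper.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against empty denominators.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Toy labels, invented for this example only.
gold = ["rumor", "rumor", "rumor", "not_rumor", "not_rumor"]
pred = ["rumor", "rumor", "not_rumor", "rumor", "not_rumor"]
precision, recall, f1 = f_measure(gold, pred)
```

With beta = 1 the score is the harmonic mean of precision and recall, so it is high only when both are high.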

Abstract

With online media now a pervasive source of information, verifying the validity of that information is becoming ever more important, yet it remains quite challenging. Rumors spread a large quantity of misinformation on microblogs. In this study we address two common issues within the context of microblog social media. First, we detect rumors as a type of misinformation propagation; next, we go beyond detection to perform the task of rumor classification. We explore the problem using a standard data set. We devise novel features and study their impact on the task. We experiment with various levels of preprocessing as a precursor to classification, as well as with grouping of features. Using a two-step classification approach, we achieve an F-measure of over 0.82 on the rumor detection and classification (RDC) task in a mixed-rumors data set and 84 percent in a single-rumor data set.
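
The two-step idea in the abstract (detect first, then classify only what was detected) can be sketched as a small cascade. This is a sketch under stated assumptions, not the authors' implementation: the function names, the label strings, and the keyword-based stand-in classifiers are all hypothetical, and a real system would plug in trained models built on the paper's features.

```python
from typing import Callable, List

# A classifier here is any callable mapping tweet text to a label string.
Classifier = Callable[[str], str]


def two_step_classify(
    tweets: List[str], detect: Classifier, classify: Classifier
) -> List[str]:
    """Run step one (detection) and pass only detected rumors to step two."""
    results = []
    for tweet in tweets:
        if detect(tweet) == "rumor":
            # Step two: assign a rumor category to detected rumors only.
            results.append(classify(tweet))
        else:
            results.append("not_rumor")
    return results


# Hypothetical stand-ins for trained models, for illustration only.
def toy_detector(tweet: str) -> str:
    return "rumor" if "heard that" in tweet.lower() else "not_rumor"


def toy_rumor_classifier(tweet: str) -> str:
    # "type_a" and "type_b" are placeholder category names.
    return "type_a" if "?" in tweet else "type_b"


sample = [
    "I heard that the bridge is closed?",
    "Heard that the mayor resigned.",
    "Lovely weather today.",
]
labels = two_step_classify(sample, toy_detector, toy_rumor_classifier)
```

Splitting the task this way lets each step use its own features and preprocessing level, at the cost that detection errors in step one propagate to step two.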

Taxonomy

Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Complex Network Analysis Techniques