Examining the Limitations of Computational Rumor Detection Models Trained on Static Datasets

Yida Mu; Xingyi Song; Kalina Bontcheva; Nikolaos Aletras

arXiv:2309.11576 · cs.CL · March 26, 2024 · 1 citation

Open Access

TL;DR

This paper examines the limitations of rumor detection models trained on static datasets, highlights the performance gap between content-based and context-based approaches on unseen rumors, and offers strategies to mitigate the effects of temporal concept drift.

Contribution

It provides an in-depth empirical analysis of how context-based models underperform on unseen rumors and suggests practical methods to address temporal biases in datasets.

Findings

01. Context-based models rely heavily on source post information.

02. A performance gap exists between content-based and context-based models on unseen rumors.

03. Strategies to reduce temporal concept drift improve model robustness.

Abstract

A crucial aspect of a rumor detection model is its ability to generalize, particularly its ability to detect emerging, previously unknown rumors. Past research has indicated that content-based (i.e., using solely source posts as input) rumor detection models tend to perform less effectively on unseen rumors. At the same time, the potential of context-based models remains largely untapped. The main contribution of this paper is in the in-depth evaluation of the performance gap between content and context-based models specifically on detecting new, unseen rumors. Our empirical findings demonstrate that context-based models are still overly dependent on the information derived from the rumors' source post and tend to overlook the significant role that contextual information can play. We also study the effect of data split strategies on classifier performance. Based on our experimental…
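The abstract mentions data split strategies without describing them on this page. As a rough illustration only, the sketch below contrasts a random split with a chronological one, where the test set contains only posts newer than anything in training. The post fields (`id`, `date`, `label`), the function names, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from datetime import date


def random_split(posts, test_ratio=0.25, seed=0):
    """Shuffle posts and split them, ignoring publication time."""
    shuffled = posts[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


def chronological_split(posts, test_ratio=0.25):
    """Train on the earliest posts and test on the most recent ones."""
    ordered = sorted(posts, key=lambda p: p["date"])
    cut = int(len(ordered) * (1 - test_ratio))
    return ordered[:cut], ordered[cut:]


# Hypothetical toy data: eight source posts on consecutive days.
posts = [
    {"id": i, "date": date(2020, 1, 1 + i), "label": i % 2}
    for i in range(8)
]

train, test = chronological_split(posts)
latest_train = max(p["date"] for p in train)
earliest_test = min(p["date"] for p in test)

# With a chronological split, every test post is newer than every training post.
print(latest_train < earliest_test)
```

A random split can place posts from the same period in both training and test sets, which may make evaluation look easier than detecting genuinely new rumors; a chronological split avoids that overlap in time.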

Taxonomy

Topics: Misinformation and Its Impacts · Complex Network Analysis Techniques · Spam and Phishing Detection