Efficient Annotator Reliability Assessment and Sample Weighting for Knowledge-Based Misinformation Detection on Social Media

Owen Cook; Charlie Grimshaw; Ben Wu; Sophie Dillon; Jack Hicks; Luke Jones; Thomas Smith; Matyas Szert; Xingyi Song

arXiv:2410.14515·cs.LG·July 29, 2025

TL;DR

This paper introduces EffiARA, a framework for assessing annotator reliability in knowledge-based misinformation detection on social media, demonstrating that sample weighting improves classification performance.

Contribution

It presents a novel annotation framework and a new dataset, and shows that reliability-based sample weighting enhances misinformation classification accuracy.

Findings

01

Weighting training samples by annotator reliability yields the best classification results.

02

Achieved a macro-F1 score of 0.757 with Llama-3.2-1B.

03

Developed and released the RUC-MCD dataset.
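For readers unfamiliar with the metric in the second finding, macro-F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of how many samples it has. The sketch below is a generic illustration of that definition; the function name and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels, purely illustrative.
toy_true = ["a", "a", "a", "b"]
toy_pred = ["a", "a", "b", "b"]
toy_score = macro_f1(toy_true, toy_pred)
```

Because each class contributes equally, a classifier that ignores a rare class is penalised more heavily under macro-F1 than under plain accuracy.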

Abstract

Misinformation spreads rapidly on social media, confusing the truth and targeting potentially vulnerable people. To effectively mitigate the negative impact of misinformation, it must first be accurately detected before applying a mitigation strategy, such as X's community notes, which is currently a manual process. This study takes a knowledge-based approach to misinformation detection, modelling the problem similarly to one of natural language inference. The EffiARA annotation framework is introduced, aiming to utilise inter- and intra-annotator agreement to understand the reliability of each annotator and influence the training of large language models for classification based on annotator reliability. In assessing the EffiARA annotation framework, the Russo-Ukrainian Conflict Knowledge-Based Misinformation Classification Dataset (RUC-MCD) was developed and made publicly available.…
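The general idea of reliability-based sample weighting can be sketched as follows. This is a minimal illustration under stated assumptions, not the EffiARA implementation: raw pairwise percentage agreement stands in for whatever agreement metric the framework uses, intra-annotator agreement is omitted, and every name, label, and number here (`annotator_reliability`, `sample_weights`, `weighted_nll`, the `supports`/`refutes` labels, the toy probabilities) is hypothetical.

```python
import math


def pairwise_agreement(labels_a, labels_b):
    """Fraction of shared samples on which two annotators agree (None if no overlap)."""
    shared = [sid for sid in labels_a if sid in labels_b]
    if not shared:
        return None
    return sum(labels_a[sid] == labels_b[sid] for sid in shared) / len(shared)


def annotator_reliability(annotations):
    """Map each annotator to their mean agreement with the others, rescaled to mean 1.0."""
    raw = {}
    for name, labels in annotations.items():
        scores = [
            pairwise_agreement(labels, other_labels)
            for other, other_labels in annotations.items()
            if other != name
        ]
        scores = [s for s in scores if s is not None]
        raw[name] = sum(scores) / len(scores) if scores else 0.0
    mean = sum(raw.values()) / len(raw)
    return {name: (r / mean if mean > 0 else 1.0) for name, r in raw.items()}


def sample_weights(annotations, reliability):
    """Weight each sample by the mean reliability of the annotators who labelled it."""
    per_sample = {}
    for name, labels in annotations.items():
        for sid in labels:
            per_sample.setdefault(sid, []).append(reliability[name])
    return {sid: sum(vals) / len(vals) for sid, vals in per_sample.items()}


def weighted_nll(gold_probs, weights):
    """Weighted mean negative log-likelihood of the gold label across samples."""
    total = sum(weights[sid] * -math.log(p) for sid, p in gold_probs.items())
    return total / sum(weights[sid] for sid in gold_probs)


# Toy annotations: {annotator: {sample_id: label}}. All values are made up.
annotations = {
    "ann_a": {"s1": "supports", "s2": "refutes", "s3": "supports", "s4": "supports"},
    "ann_b": {"s1": "supports", "s2": "refutes", "s3": "refutes", "s4": "supports"},
    "ann_c": {"s1": "refutes", "s3": "refutes"},
}
reliability = annotator_reliability(annotations)
weights = sample_weights(annotations, reliability)

# Hypothetical model probabilities assigned to each sample's gold label.
gold_probs = {"s1": 0.9, "s2": 0.6, "s3": 0.7, "s4": 0.8}
loss = weighted_nll(gold_probs, weights)
```

In this toy setup, samples labelled only by the two annotators who agree most often receive a larger weight in the loss than samples that also involve the least consistent annotator. A chance-corrected agreement measure would be a natural replacement for the raw percentage used here.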

Taxonomy

Topics: Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection · Spam and Phishing Detection