MM-COVID: A Multilingual and Multimodal Data Repository for Combating COVID-19 Disinformation

Yichuan Li; Bohan Jiang; Kai Shu; Huan Liu

arXiv:2011.04088 · cs.SI · November 24, 2020 · 52 cites

TL;DR

This paper introduces MM-COVID, a multilingual and multimodal dataset of COVID-19 fake news and trustworthy information across six languages, aimed at improving fake news detection and analysis.

Contribution

The creation of a comprehensive multilingual fake news dataset with social context for COVID-19, facilitating research in multilingual fake news detection and social media analysis.

Findings

01

The dataset includes 3981 fake news items and 7192 trustworthy items across six languages.

02

Detailed analysis of the dataset reveals linguistic and social patterns in COVID-19 misinformation.

03

The paper demonstrates the dataset's utility in various fake news detection applications.

Abstract

The COVID-19 epidemic is considered a global health crisis for the whole of society and the greatest challenge mankind has faced since World War Two. Unfortunately, fake news about COVID-19 is spreading as fast as the virus itself. Incorrect health measures, anxiety, and hate speech will have harmful consequences for people's physical and mental health around the world. To help better combat COVID-19 fake news, we propose a new fake news detection dataset, MM-COVID (Multilingual and Multidimensional COVID-19 Fake News Data Repository). This dataset provides multilingual fake news and the relevant social context. We collect 3981 pieces of fake news content and 7192 pieces of trustworthy information in six languages: English, Spanish, Portuguese, Hindi, French, and Italian. We present a detailed and exploratory analysis of MM-COVID from different…

Taxonomy

Topics: Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining