Check Mate: Prioritizing User Generated Multi-Media Content for Fact-Checking
Tarunima Prabhakar, Anushree Gupta, Kruttika Nadig, Denny George

TL;DR
This paper introduces a new dataset for prioritizing user-generated multi-media social media posts in Hindi for fact-checking, addressing the lack of non-English, multi-modal resources for identifying check-worthy content.
Contribution
It presents a novel dataset focusing on Hindi user-generated multi-media content, including metadata, to aid in prioritizing posts for fact-checking, expanding research beyond English news articles.
Findings
Dataset includes multi-modal social media posts with metadata
Enables correlative analysis of virality (shares and likes) and misinformation
Supports development of fact-checking prioritization tools
Abstract
The volume of content and misinformation on social media is rapidly increasing. There is a need for systems that can support fact-checkers by prioritizing content that needs to be fact-checked. Prior research on prioritizing content for fact-checking has focused on news media articles, predominantly in English. Increasingly, misinformation is found in user-generated content. In this paper we present a novel dataset that can be used to prioritize check-worthy posts from multi-media content in Hindi. It is unique in its 1) focus on user-generated content, 2) language and 3) accommodation of multi-modality in social media posts. We also provide metadata for each post, such as the number of shares and likes of the post on ShareChat, a popular Indian social media platform, which allows for correlative analysis around virality and misinformation. The data is accessible on Zenodo…
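The correlative analysis mentioned in the abstract could be approached along the following lines. This is a minimal sketch, not the authors' method: the field names (`shares`, `likes`, `check_worthy`) and the toy records are hypothetical and do not reflect the actual schema or contents of the dataset. It computes a Pearson correlation between log-scaled engagement counts and a binary check-worthiness label.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def virality_correlation(posts, field):
    """Correlate log-scaled engagement with a binary check-worthiness label.

    `field` and `check_worthy` are hypothetical keys chosen for this sketch.
    """
    # log1p dampens the heavy tail that engagement counts usually have.
    xs = [math.log1p(p[field]) for p in posts]
    ys = [p["check_worthy"] for p in posts]
    return pearson(xs, ys)


# Toy records invented for illustration only; not taken from the dataset.
posts = [
    {"shares": 10, "likes": 5, "check_worthy": 0},
    {"shares": 20, "likes": 8, "check_worthy": 0},
    {"shares": 500, "likes": 120, "check_worthy": 1},
    {"shares": 1000, "likes": 300, "check_worthy": 1},
]

for field in ("shares", "likes"):
    print(field, round(virality_correlation(posts, field), 3))
```

With a binary label, the Pearson coefficient is equivalent to the point-biserial correlation. A rank-based measure such as Spearman's may be a better fit if engagement counts are heavily skewed even after log scaling.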
Taxonomy
Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Hate Speech and Cyberbullying Detection
