LTCR: Long-Text Chinese Rumor Detection Dataset
Ziyang Ma, Mengsha Liu, Guian Fang, Ying Shen

TL;DR
This paper introduces LTCR, a new long-text Chinese rumor detection dataset, and proposes a salience-aware fake news detection model that achieves high accuracy in identifying fake news, especially complex COVID-19-related misinformation.
Contribution
The paper provides a novel long-text Chinese rumor dataset and a salience-aware detection model that improves fake news identification accuracy.
Findings
Achieved 95.85% accuracy on the dataset
High fake news recall of 90.91%
F-score of 90.60% in detection
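The three figures above are standard confusion-matrix metrics. The following is a minimal sketch of how they are typically computed, assuming fake news is the positive class and that "F-score" means the F1 score (the summary does not state which variant is used). The function name and the counts are hypothetical and are not taken from the paper.

```python
def fake_news_metrics(tp, fn, fp, tn):
    """Compute accuracy, recall, precision, and F1 with fake news as the positive class.

    tp: fake items predicted fake      fn: fake items predicted real
    fp: real items predicted fake      tn: real items predicted real
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators so the function never divides by zero.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Hypothetical counts for illustration only (not results from the paper).
metrics = fake_news_metrics(tp=8, fn=2, fp=2, tn=38)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Fake news recall is reported separately from accuracy because it measures how many of the fake items were actually caught, which accuracy alone does not show.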
Abstract
False information can spread quickly on social media, negatively influencing citizens' behaviors and responses to social events. To better detect fake news, especially long texts, which are harder to detect completely, a Long-Text Chinese Rumor detection dataset named LTCR is proposed. The LTCR dataset provides a valuable resource for accurately detecting misinformation, especially in the context of complex fake news related to COVID-19. The dataset consists of 1,729 pieces of real news and 500 pieces of fake news. The average lengths of real and fake news are approximately 230 and 152 characters, respectively. We also propose a Salience-aware Fake News Detection Model, which achieves the highest accuracy (95.85%), fake news recall (90.91%) and F-score (90.60%) on the dataset. (https://github.com/Enderfga/DoubleCheck)
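The class counts in the abstract imply an imbalanced dataset, which helps put the reported accuracy in context. The arithmetic below is a rough sketch using only the two counts stated in the abstract. The "always predict real" baseline is an illustrative assumption, not a system evaluated in the paper, and the variable names are hypothetical.

```python
# Counts stated in the abstract.
REAL_COUNT = 1729
FAKE_COUNT = 500

total = REAL_COUNT + FAKE_COUNT
fake_share = FAKE_COUNT / total

# Hypothetical baseline: label every item as real news.
baseline_accuracy = REAL_COUNT / total  # every real item is correct, every fake item is wrong
baseline_fake_recall = 0 / FAKE_COUNT   # no fake item is ever flagged

print(f"total items: {total}")
print(f"fake share: {fake_share:.2%}")
print(f"majority-class accuracy: {baseline_accuracy:.2%}")
print(f"majority-class fake recall: {baseline_fake_recall:.2%}")
```

Under this assumption, a trivial classifier already reaches roughly 78% accuracy while catching no fake news at all, so fake news recall and F-score are more informative than accuracy alone on this dataset.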
Taxonomy
Topics: Misinformation and Its Impacts · Topic Modeling · Spam and Phishing Detection
