Misinformation Has High Perplexity
Nayeon Lee, Yejin Bang, Andrea Madotto, Pascale Fung

TL;DR
This paper proposes an unsupervised method for debunking misinformation that exploits the higher perplexity of false claims. It extracts evidence relevant to a claim, primes a language model with that evidence, and scores the claim by its perplexity; the method is evaluated on COVID-19-related claims.
Contribution
The paper introduces a novel unsupervised approach to misinformation debunking based on perplexity, along with new COVID-19 datasets for evaluation.
Findings
The proposed system outperforms existing methods on the COVID-19 datasets.
Perplexity effectively distinguishes false claims from true ones.
New datasets facilitate research in misinformation debunking.
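For reference, the perplexity score these findings rely on can be written in the standard form below, with the extracted evidence acting as conditioning context. The notation (claim tokens $x_1, \dots, x_N$, evidence $E$, language model $p_\theta$) is assumed here for illustration and is not taken from the paper.

```latex
\mathrm{PPL}(x \mid E) = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta\left(x_i \mid E, x_{<i}\right) \right)
```

Under the paper's hypothesis, a claim that conflicts with the evidence should receive lower token probabilities and therefore a higher perplexity than a claim the evidence supports.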
Abstract
Debunking misinformation is an important and time-critical task as there could be adverse consequences when misinformation is not quashed promptly. However, the usual supervised approach to debunking via misinformation classification requires human-annotated data and is not suited to the fast time-frame of newly emerging events such as the COVID-19 outbreak. In this paper, we postulate that misinformation itself has higher perplexity compared to truthful statements, and propose to leverage the perplexity to debunk false claims in an unsupervised manner. First, we extract reliable evidence from scientific and news sources according to sentence similarity to the claims. Second, we prime a language model with the extracted evidence and finally evaluate the correctness of given claims based on the perplexity scores at debunking time. We construct two new COVID-19-related test sets, one is…
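The pipeline described in the abstract (select evidence by similarity, prime a model with it, score the claim by perplexity) can be sketched as follows. This is a toy illustration only: a bag-of-words cosine similarity and an add-one-smoothed unigram model stand in for the sentence-similarity measure and the language model used in the paper, and all function names and example sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity (assumed stand-in for sentence similarity)."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def select_evidence(claim, sentences, k=2):
    """Step 1: keep the k sentences most similar to the claim."""
    return sorted(sentences, key=lambda s: cosine(claim, s), reverse=True)[:k]


def perplexity(claim, evidence):
    """Steps 2-3: 'prime' a toy unigram model on the evidence, then score the claim.

    Assumption: add-one smoothing over the joint vocabulary replaces the
    pre-trained language model used in the paper.
    """
    claim_tokens = tokenize(claim)
    if not claim_tokens:
        raise ValueError("claim has no tokens")
    ev_tokens = [t for s in evidence for t in tokenize(s)]
    counts = Counter(ev_tokens)
    vocab = set(ev_tokens) | set(claim_tokens)
    total = len(ev_tokens) + len(vocab)
    log_prob = sum(math.log((counts[t] + 1) / total) for t in claim_tokens)
    return math.exp(-log_prob / len(claim_tokens))


# Hypothetical evidence pool and claims, for illustration only.
corpus = [
    "Washing hands with soap reduces the spread of the virus.",
    "Vaccines are tested in clinical trials before approval.",
    "The stock market closed higher on Friday.",
]
supported = "Washing hands reduces the spread of the virus."
unsupported = "Drinking bleach cures the virus."

for claim in (supported, unsupported):
    score = perplexity(claim, select_evidence(claim, corpus))
    print(f"{score:8.2f}  {claim}")
```

In this toy setup the unsupported claim scores a higher perplexity than the supported one, but only because of lexical overlap with the evidence. A unigram model cannot detect a false claim that reuses the evidence's vocabulary, which is one reason a stronger conditional language model is needed in practice.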
Taxonomy
Topics: Misinformation and Its Impacts · Topic Modeling · Sentiment Analysis and Opinion Mining
