Generalizing to the Future: Mitigating Entity Bias in Fake News Detection

Yongchun Zhu; Qiang Sheng; Juan Cao; Shuokai Li; Danding Wang; Fuzhen Zhuang

arXiv:2204.09484·cs.CL·April 21, 2022

TL;DR

This paper introduces ENDEF, a causal framework that mitigates entity bias in fake news detection models, significantly improving their ability to generalize to future, unseen data.

Contribution

It presents the first explicit approach to enhance fake news detection models' generalization to future data by removing entity bias through causal modeling.

Findings

01. Significant performance improvements on English and Chinese datasets.

02. Enhanced generalization ability demonstrated in online tests.

03. First work to explicitly address future data generalization in fake news detection.

Abstract

The wide dissemination of fake news is increasingly threatening both individuals and society. Fake news detection aims to train a model on the past news and detect fake news of the future. Though great efforts have been made, existing fake news detection methods overlooked the unintended entity bias in the real-world data, which seriously influences models' generalization ability to future data. For example, 97% of news pieces in 2010-2017 containing the entity 'Donald Trump' are real in our data, but the percentage falls down to merely 33% in 2018. This would lead the model trained on the former set to hardly generalize to the latter, as it tends to predict news pieces about 'Donald Trump' as real for lower training loss. In this paper, we propose an entity debiasing framework (ENDEF) which generalizes fake news detection models to the future data by mitigating entity bias…
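
The label shift described in the abstract can be made concrete with a minimal sketch. It assumes a hypothetical list of news records with `year`, `entities`, and `label` fields and a hypothetical helper `real_rate_by_period`; the toy records are invented for illustration and are not the paper's data. The sketch only measures how the share of real news mentioning an entity differs between a past and a future period. It is not the ENDEF debiasing method.

```python
from collections import defaultdict


def real_rate_by_period(news, entity, split_year):
    """Share of 'real' items mentioning `entity`, before vs. from `split_year`.

    `news` is a hypothetical list of dicts with keys 'year', 'entities', 'label'.
    """
    counts = defaultdict(lambda: [0, 0])  # period -> [real_count, total_count]
    for item in news:
        if entity not in item["entities"]:
            continue  # only items that mention the entity are counted
        period = "past" if item["year"] < split_year else "future"
        counts[period][1] += 1
        if item["label"] == "real":
            counts[period][0] += 1
    return {p: real / total for p, (real, total) in counts.items()}


# Toy records, invented for illustration only.
toy_news = [
    {"year": 2015, "entities": {"Entity A"}, "label": "real"},
    {"year": 2016, "entities": {"Entity A"}, "label": "real"},
    {"year": 2017, "entities": {"Entity A", "Entity B"}, "label": "real"},
    {"year": 2017, "entities": {"Entity A"}, "label": "fake"},
    {"year": 2018, "entities": {"Entity A"}, "label": "real"},
    {"year": 2018, "entities": {"Entity A"}, "label": "fake"},
    {"year": 2018, "entities": {"Entity A"}, "label": "fake"},
    {"year": 2018, "entities": {"Entity B"}, "label": "real"},
]

rates = real_rate_by_period(toy_news, "Entity A", split_year=2018)
print(rates)
```

On these toy records the share of real items mentioning "Entity A" drops from 0.75 in the past period to about 0.33 in the future period. A model that learned "Entity A implies real" from the past split would carry that shortcut into the future split, which is the failure mode the abstract describes.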

Code & Models

Repositories

ictmcg/endef-sigir2022 (PyTorch, official)

Taxonomy

Methods: Balanced Selection