Robust Information Retrieval for False Claims with Distracting Entities In Fact Extraction and Verification

Mingwen Dong; Christos Christodoulopoulos; Sheng-Min Shih; Xiaofei Ma

arXiv:2112.07618·cs.IR·December 15, 2021·1 cites

Open Access

TL;DR

This paper investigates the challenges false claims pose to evidence retrieval in fact checking, revealing that irrelevant entities in false claims hinder retrieval accuracy, and proposes data augmentation and model ensemble techniques to improve robustness.

Contribution

It identifies the impact of irrelevant entities in false claims on retrieval performance and introduces data augmentation and model ensemble methods to enhance robustness.

Findings

01

Retrieval models perform worse on false claims with irrelevant entities.

02

Data augmentation with synthetic false claims improves recall.

03

Model ensemble strategies increase evidence retrieval accuracy.
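As a rough illustration of the third finding, the sketch below averages per-sentence relevance scores from several retrievers and measures recall at k. The averaging rule, the names `ensemble_scores` and `recall_at_k`, and the toy scores are all assumptions made for illustration; this summary does not say which ensemble strategy the paper uses.

```python
from typing import Dict, List, Set


def ensemble_scores(model_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each candidate's score over all models (assumed rule, not the paper's)."""
    candidates = set().union(*(s.keys() for s in model_scores))
    return {
        # A model that did not score a candidate contributes 0.0.
        c: sum(s.get(c, 0.0) for s in model_scores) / len(model_scores)
        for c in candidates
    }


def recall_at_k(scores: Dict[str, float], gold: Set[str], k: int) -> float:
    """Fraction of gold evidence ids found among the k highest-scoring candidates."""
    if not gold:
        return 0.0
    # Sort by descending score; break ties by id so the ranking is deterministic.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return len(set(ranked[:k]) & gold) / len(gold)


# Toy scores for three candidate sentences (made-up numbers).
model_a = {"s1": 0.75, "s2": 0.5, "s3": 0.25}
model_b = {"s1": 0.25, "s2": 1.0, "s3": 0.5}
gold_evidence = {"s2"}

combined = ensemble_scores([model_a, model_b])
print(recall_at_k(model_a, gold_evidence, k=1))   # single model
print(recall_at_k(combined, gold_evidence, k=1))  # ensemble
```

In this toy case the first model alone ranks a non-evidence sentence first, while the averaged scores put the gold sentence on top. Real ensembles may instead combine ranks or use learned weights.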

Abstract

Accurate evidence retrieval is essential for automated fact checking. Little previous research has focused on the differences between true and false claims and how they affect evidence retrieval. This paper shows that, compared with true claims, false claims more frequently contain irrelevant entities that can distract the evidence retrieval model. A BERT-based retrieval model made more mistakes in retrieving refuting evidence for false claims than supporting evidence for true claims. When tested with adversarial false claims (synthetically generated) containing irrelevant entities, the recall of the retrieval model is significantly lower than that for original claims. These results suggest that the vanilla BERT-based retrieval model is not robust to irrelevant entities in false claims. By augmenting the training data with synthetic false claims containing irrelevant entities, the…
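The augmentation idea in the abstract can be sketched as follows, under stated assumptions. The clause template, the helper names `make_distractor_claim` and `augment`, the entity pool, and the choice to label every synthetic claim `REFUTES` are hypothetical; the abstract does not describe how the paper generates its synthetic claims.

```python
import random
from typing import List, Tuple

Example = Tuple[str, List[str], str]  # (claim text, entities in claim, label)


def make_distractor_claim(claim: str, claim_entities: List[str],
                          entity_pool: List[str],
                          rng: random.Random) -> Tuple[str, str]:
    """Return (synthetic_claim, distractor) with one unrelated entity appended."""
    candidates = [e for e in entity_pool if e not in claim_entities]
    if not candidates:
        raise ValueError("entity pool has no entity unrelated to the claim")
    distractor = rng.choice(candidates)
    # Hypothetical template: attach a clause that mentions the distractor.
    synthetic = f"{claim.rstrip('.')} and is associated with {distractor}."
    return synthetic, distractor


def augment(examples: List[Example], entity_pool: List[str],
            seed: int = 0) -> List[Tuple[str, str]]:
    """Keep every original claim and add one synthetic false claim per original."""
    rng = random.Random(seed)
    augmented = []
    for claim, entities, label in examples:
        augmented.append((claim, label))
        synthetic, _ = make_distractor_claim(claim, entities, entity_pool, rng)
        # Assumption: the added clause makes the claim false.
        augmented.append((synthetic, "REFUTES"))
    return augmented


examples = [("Marie Curie was born in Warsaw.",
             ["Marie Curie", "Warsaw"], "SUPPORTS")]
pool = ["Warsaw", "Mount Everest"]
for text, label in augment(examples, pool):
    print(label, "|", text)
```

For retrieval training, the gold evidence for a synthetic claim would presumably remain the evidence for the original entities, so the model learns to ignore the distractor. That pairing is also an assumption here.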

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Explainable Artificial Intelligence (XAI)