Investigating Zero- and Few-shot Generalization in Fact Verification

Liangming Pan; Yunxiang Zhang; Min-Yen Kan

arXiv:2309.09444·cs.CL·September 19, 2023

TL;DR

This paper studies zero- and few-shot generalization in fact verification (FV) across domains, shows that current models transfer poorly, and identifies two directions that improve transfer.

Contribution

The paper builds a benchmark collection of 11 FV datasets spanning 6 domains, analyzes the factors that affect model generalization, and shows that domain-specific pretraining and automatic claim generation improve it.

Findings

1. Current models generalize poorly across domains.
2. Dataset size, evidence length, and claim type influence generalization.
3. Pretraining on specialized domains and claim generation improve transferability.

Abstract

In this paper, we explore zero- and few-shot generalization for fact verification (FV), which aims to generalize the FV model trained on well-resourced domains (e.g., Wikipedia) to low-resourced domains that lack human annotations. To this end, we first construct a benchmark dataset collection which contains 11 FV datasets representing 6 domains. We conduct an empirical analysis of generalization across these FV datasets, finding that current models generalize poorly. Our analysis reveals that several factors affect generalization, including dataset size, length of evidence, and the type of claims. Finally, we show that two directions of work improve generalization: 1) incorporating domain knowledge via pretraining on specialized domains, and 2) automatically generating training data via claim generation.
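The zero-shot setting described in the abstract can be pictured as a transfer matrix: train on one dataset, then score the resulting model on every dataset's test split without any target-domain training. The sketch below is only an illustration under stated assumptions, not the authors' pipeline. The dataset names, the toy examples, the majority-label "model", and the use of accuracy as the metric are all hypothetical stand-ins chosen so the example runs without external libraries.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str, str]  # (claim, evidence, label)
Model = Callable[[str, str], str]


def train_majority(train: List[Example]) -> Model:
    """Toy stand-in for an FV model: always predicts the most common training label."""
    majority = Counter(label for _, _, label in train).most_common(1)[0][0]
    return lambda claim, evidence: majority


def accuracy(model: Model, test: List[Example]) -> float:
    """Fraction of test examples whose predicted label matches the gold label."""
    correct = sum(model(claim, evidence) == label for claim, evidence, label in test)
    return correct / len(test)


def transfer_matrix(
    datasets: Dict[str, Dict[str, List[Example]]],
    train_fn: Callable[[List[Example]], Model] = train_majority,
) -> Dict[Tuple[str, str], float]:
    """Train on each source dataset and evaluate on every target test split."""
    scores = {}
    for source, source_splits in datasets.items():
        model = train_fn(source_splits["train"])
        for target, target_splits in datasets.items():
            # source == target is the in-domain score; source != target is zero-shot.
            scores[(source, target)] = accuracy(model, target_splits["test"])
    return scores


# Made-up placeholder data; the dataset names and labels are illustrative only.
toy_datasets = {
    "wiki": {
        "train": [("c1", "e1", "SUPPORTS"), ("c2", "e2", "SUPPORTS"), ("c3", "e3", "REFUTES")],
        "test": [("c4", "e4", "SUPPORTS"), ("c5", "e5", "REFUTES")],
    },
    "science": {
        "train": [("c6", "e6", "REFUTES"), ("c7", "e7", "REFUTES"), ("c8", "e8", "SUPPORTS")],
        "test": [("c9", "e9", "REFUTES"), ("c10", "e10", "REFUTES"),
                 ("c11", "e11", "SUPPORTS"), ("c12", "e12", "REFUTES")],
    },
}

scores = transfer_matrix(toy_datasets)
for (source, target), score in sorted(scores.items()):
    print(f"{source} -> {target}: {score:.2f}")
```

In practice, `train_fn` would be replaced by fine-tuning a real FV model, and the off-diagonal entries of the matrix are the ones that measure cross-domain generalization.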

Code & Models

Repositories

teacherpeterpan/fact-checking-generalization (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies