Investigating Zero- and Few-shot Generalization in Fact Verification
Liangming Pan, Yunxiang Zhang, Min-Yen Kan

TL;DR
This paper investigates the challenges of zero- and few-shot generalization in fact verification across multiple domains, highlighting current limitations and proposing methods to improve transferability.
Contribution
The paper introduces a benchmark collection of 11 fact verification (FV) datasets spanning 6 domains, analyzes the factors that affect model generalization across them, and proposes two remedies: pretraining on domain-specific text and automatically generating training claims.
Findings
Current models generalize poorly across domains.
Dataset size, evidence length, and claim type influence generalization.
Pretraining on specialized domains and claim generation improve transferability.
Abstract
In this paper, we explore zero- and few-shot generalization for fact verification (FV), which aims to generalize the FV model trained on well-resourced domains (e.g., Wikipedia) to low-resourced domains that lack human annotations. To this end, we first construct a benchmark dataset collection which contains 11 FV datasets representing 6 domains. We conduct an empirical analysis of generalization across these FV datasets, finding that current models generalize poorly. Our analysis reveals that several factors affect generalization, including dataset size, length of evidence, and the type of claims. Finally, we show that two directions of work improve generalization: 1) incorporating domain knowledge via pretraining on specialized domains, and 2) automatically generating training data via claim generation.
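The evaluation setup described in the abstract (train on one dataset, test on another, optionally with a few target-domain examples) can be sketched as a transfer matrix. This is a minimal illustration, not the authors' code: the dataset names `wiki` and `science`, the toy examples, the `train_majority` stand-in model, and the choice to implement few-shot transfer by simply appending `k_shot` target examples to the source training set are all assumptions made for this sketch.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str, str]            # (claim, evidence, label)
Model = Callable[[str, str], str]         # maps (claim, evidence) to a label
Dataset = Dict[str, List[Example]]        # {"train": [...], "test": [...]}


def train_majority(examples: List[Example]) -> Model:
    """Hypothetical stand-in for an FV model: predicts the most frequent training label."""
    majority = Counter(label for _, _, label in examples).most_common(1)[0][0]
    return lambda claim, evidence: majority


def accuracy(model: Model, examples: List[Example]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    correct = sum(model(claim, evidence) == label for claim, evidence, label in examples)
    return correct / len(examples)


def transfer_matrix(datasets: Dict[str, Dataset], k_shot: int = 0) -> Dict[Tuple[str, str], float]:
    """Train on each source dataset and evaluate on every other (target) dataset.

    k_shot = 0 is zero-shot transfer. For k_shot > 0, the first k_shot target
    training examples are added to the source training data (an assumption of
    this sketch, not necessarily the paper's few-shot procedure).
    """
    results = {}
    for source, source_data in datasets.items():
        for target, target_data in datasets.items():
            if source == target:
                continue  # in-domain results are not part of the transfer matrix
            few_shot_examples = target_data["train"][:k_shot]
            model = train_majority(source_data["train"] + few_shot_examples)
            results[(source, target)] = accuracy(model, target_data["test"])
    return results


# Toy data with made-up claims; the labels follow the common SUPPORTS / REFUTES convention.
toy = {
    "wiki": {
        "train": [("c1", "e1", "SUPPORTS"), ("c2", "e2", "SUPPORTS"), ("c3", "e3", "REFUTES")],
        "test": [("c4", "e4", "SUPPORTS"), ("c5", "e5", "REFUTES")],
    },
    "science": {
        "train": [("c6", "e6", "REFUTES"), ("c7", "e7", "REFUTES"), ("c8", "e8", "REFUTES")],
        "test": [("c9", "e9", "REFUTES"), ("c10", "e10", "REFUTES")],
    },
}

zero_shot = transfer_matrix(toy, k_shot=0)
few_shot = transfer_matrix(toy, k_shot=3)
for pair in sorted(zero_shot):
    print(pair, "zero-shot:", zero_shot[pair], "few-shot:", few_shot[pair])
```

Replacing `train_majority` with a real claim-evidence classifier leaves the rest of the loop unchanged, which is the point of keeping the model behind a plain callable.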
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
