Exploring Factual Entailment with NLI: A News Media Study
Guy Mor-Lan, Effi Levi

TL;DR
This paper introduces FactRel, an annotation scheme for factual entailment in news media. Most factually supporting or undermining sentence pairs do not match the traditional NLI labels of entailment or contradiction, and GPT-4 shows promise both as a source of synthetic training data and as a few-shot classifier.
Contribution
The paper presents FactRel, a novel dataset and annotation scheme for factual entailment, and explores the effectiveness of GPT-4 in enhancing model performance on this task.
Findings
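84% of factually supporting pairs are not NLI entailments, and 63% of factually undermining pairs are not NLI contradictions
In some cases, GPT-4-generated synthetic data improves classification performance
Few-shot GPT-4 achieves results on par with medium LMs (DeBERTa) trained on the labelled dataset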
Abstract
We explore the relationship between factuality and Natural Language Inference (NLI) by introducing FactRel, a novel annotation scheme that models factual rather than textual entailment, and use it to annotate a dataset of naturally occurring sentences from news articles. Our analysis shows that 84% of factually supporting pairs and 63% of factually undermining pairs do not amount to NLI entailment or contradiction, respectively, suggesting that factual relationships are more apt for analyzing media discourse. We experiment with models for pairwise classification on the new dataset, and find that in some cases, generating synthetic data with GPT-4 on the basis of the annotated dataset can improve performance. Surprisingly, few-shot learning with GPT-4 yields strong results on par with medium LMs (DeBERTa) trained on the labelled dataset. We hypothesize that these…
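The 84% and 63% figures compare two labels assigned to the same sentence pair: a factual relation and an NLI relation. A minimal sketch of how such a mismatch rate could be computed is given below. The label strings, the `pairs` structure, and the `mismatch_rate` helper are hypothetical and chosen for illustration, and the six records are toy data, not drawn from the FactRel dataset.

```python
# Toy records: each sentence pair carries a factual label and an NLI label.
# Label names are assumptions; the abstract only speaks of "supporting" and
# "undermining" factual relations and of NLI entailment and contradiction.
pairs = [
    {"factual": "support", "nli": "entailment"},
    {"factual": "support", "nli": "neutral"},
    {"factual": "support", "nli": "neutral"},
    {"factual": "undermine", "nli": "contradiction"},
    {"factual": "undermine", "nli": "neutral"},
    {"factual": "neutral", "nli": "neutral"},
]


def mismatch_rate(records, factual_label, nli_label):
    """Share of pairs with `factual_label` whose NLI label is not `nli_label`."""
    subset = [r for r in records if r["factual"] == factual_label]
    if not subset:
        raise ValueError(f"no pairs labelled {factual_label!r}")
    misses = sum(1 for r in subset if r["nli"] != nli_label)
    return misses / len(subset)


# Supporting pairs that are not NLI entailments (2 of 3 in the toy data).
support_rate = mismatch_rate(pairs, "support", "entailment")
# Undermining pairs that are not NLI contradictions (1 of 2 in the toy data).
undermine_rate = mismatch_rate(pairs, "undermine", "contradiction")

print(f"support vs. entailment mismatch: {support_rate:.0%}")
print(f"undermine vs. contradiction mismatch: {undermine_rate:.0%}")
```

Applied to a doubly annotated dataset, the same two calls would yield the kind of statistics reported in the abstract.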
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Digital and Cyber Forensics
Methods: Attention Is All You Need · Softmax · Layer Normalization · Absolute Position Encodings · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam · Linear Layer
