Counterfactual reasoning: Do language models need world knowledge for causal understanding?
Jiaxuan Li, Lang Yu, Allyson Ettinger

TL;DR
This paper investigates whether pre-trained language models can reason about counterfactual conditionals, finding that most models rely heavily on lexical cues and that only GPT-3 shows sensitivity to the finer linguistic nuances of counterfactuals.
Contribution
The study introduces tests drawn from psycholinguistic experiments, along with larger-scale controlled datasets, to evaluate counterfactual predictions, highlighting models' reliance on lexical cues and their limited integration of world knowledge.
Findings
Models can override real-world knowledge in counterfactuals.
Stronger world knowledge correlates with more robust counterfactual reasoning.
GPT-3 uniquely shows sensitivity to linguistic nuances of counterfactuals.
Abstract
Current pre-trained language models have enabled remarkable improvements in downstream tasks, but it remains difficult to distinguish effects of statistical correlation from more systematic logical reasoning grounded on understanding of the real world. In this paper we tease these factors apart by leveraging counterfactual conditionals, which force language models to predict unusual consequences based on hypothetical propositions. We introduce a set of tests drawn from psycholinguistic experiments, as well as larger-scale controlled datasets, to probe counterfactual predictions from a variety of popular pre-trained language models. We find that models are consistently able to override real-world knowledge in counterfactual scenarios, and that this effect is more robust in case of stronger baseline world knowledge -- however, we also find that for most models this effect appears largely…
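The probing setup described in the abstract can be sketched as a preference test: given a counterfactual context, check whether a scorer rates the completion consistent with the hypothetical premise above the completion consistent with real-world knowledge. The following is only a minimal illustration under stated assumptions, not the authors' code. The names `Item`, `counterfactual_preference`, and `toy_score`, the two example items, and the association weights are all hypothetical; in practice `toy_score` would be replaced by a language model's log-probability for the completion.

```python
from typing import Callable, Iterable, NamedTuple


class Item(NamedTuple):
    context: str        # counterfactual premise plus the start of the consequence
    cf_completion: str  # completion consistent with the hypothetical premise
    rw_completion: str  # completion consistent with real-world knowledge


def counterfactual_preference(
    items: Iterable[Item], score: Callable[[str, str], float]
) -> float:
    """Fraction of items where the counterfactual-consistent completion scores higher."""
    items = list(items)
    wins = sum(
        score(it.context, it.cf_completion) > score(it.context, it.rw_completion)
        for it in items
    )
    return wins / len(items)


# Hypothetical word-association weights standing in for a real model.
# A scorer like this succeeds purely through lexical cues in the context.
ASSOCIATIONS = {
    ("vegetarians", "carrots"): 1.0,
    ("cats", "fish"): 0.5,
    ("wings", "fly"): 1.0,
    ("dogs", "run"): 0.5,
}


def toy_score(context: str, completion: str) -> float:
    """Sum the association weights between context words and the completion."""
    words = set(context.lower().replace(",", "").split())
    return sum(ASSOCIATIONS.get((word, completion), 0.0) for word in words)


# Made-up illustrative items, not taken from the paper's datasets.
ITEMS = [
    Item("If cats were vegetarians, families would feed cats with", "carrots", "fish"),
    Item("If dogs had wings, dogs would", "fly", "run"),
]

preference = counterfactual_preference(ITEMS, toy_score)
```

Because `toy_score` looks only at which words co-occur, a high preference score here reflects lexical cues rather than any understanding of the counterfactual, which is the kind of confound the paper's controlled datasets are meant to separate out.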
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
Methods: Multi-Head Attention · Attention Is All You Need · Test · Cosine Annealing · Linear Warmup With Cosine Annealing · Adam · Softmax · Layer Normalization · Dropout
