Counterfactual reasoning: Do language models need world knowledge for causal understanding?

Jiaxuan Li; Lang Yu; Allyson Ettinger

arXiv:2212.03278·cs.CL·December 8, 2022

TL;DR

This paper investigates whether pre-trained language models can reason about counterfactual conditionals. It finds that most models rely heavily on lexical cues, and that only GPT-3 shows sensitivity to the finer linguistic nuances of counterfactuals.

Contribution

The study introduces tests drawn from psycholinguistic experiments, along with larger-scale controlled datasets, to evaluate counterfactual reasoning. The results highlight models' reliance on lexical cues and their limited integration of world knowledge.

Findings

01  Models can override real-world knowledge in counterfactuals.

02  Stronger world knowledge correlates with more robust counterfactual reasoning.

03  GPT-3 uniquely shows sensitivity to linguistic nuances of counterfactuals.

Abstract

Current pre-trained language models have enabled remarkable improvements in downstream tasks, but it remains difficult to distinguish effects of statistical correlation from more systematic logical reasoning grounded on understanding of the real world. In this paper we tease these factors apart by leveraging counterfactual conditionals, which force language models to predict unusual consequences based on hypothetical propositions. We introduce a set of tests drawn from psycholinguistic experiments, as well as larger-scale controlled datasets, to probe counterfactual predictions from a variety of popular pre-trained language models. We find that models are consistently able to override real-world knowledge in counterfactual scenarios, and that this effect is more robust in case of stronger baseline world knowledge -- however, we also find that for most models this effect appears largely…
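The kind of comparison the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' evaluation code: the item format, the `preference_rate` helper, the example sentences, and the `toy_score` stand-in are all assumptions made for this sketch. In practice `score` would be replaced by a function returning a language model's log-probability of the completion given the context.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical item format (an assumption for this sketch):
# (context, counterfactual-consistent completion, real-world completion)
Item = Tuple[str, str, str]


def preference_rate(items: Sequence[Item],
                    score: Callable[[str, str], float]) -> float:
    """Fraction of items where the counterfactual-consistent completion
    scores strictly higher than the real-world completion."""
    if not items:
        raise ValueError("items must be non-empty")
    wins = sum(1 for ctx, cf, rw in items if score(ctx, cf) > score(ctx, rw))
    return wins / len(items)


# Toy association weights standing in for a model. This is NOT a language
# model: it only rewards completions associated with words in the context.
TOY_ASSOCIATIONS = {
    ("vegetarians", "carrots"): 2.0,
    ("cats", "fish"): 1.0,
}


def toy_score(context: str, completion: str) -> float:
    """Purely lexical scorer: sums association weights between each distinct
    context word and the completion."""
    words = set(context.lower().replace(",", "").split())
    return sum(TOY_ASSOCIATIONS.get((w, completion), 0.0) for w in words)


# Illustrative items written for this sketch, not taken from the paper's datasets.
counterfactual_item = (
    "If cats were vegetarians, families would feed cats with",
    "carrots",  # consistent with the counterfactual premise
    "fish",     # consistent with real-world knowledge
)
baseline_item = ("Families would feed cats with", "carrots", "fish")

cf_rate = preference_rate([counterfactual_item], toy_score)
baseline_rate = preference_rate([baseline_item], toy_score)
```

Note that the toy scorer prefers "carrots" in the counterfactual context using nothing but word associations. That is why a high preference rate on its own cannot separate genuine counterfactual reasoning from reliance on lexical cues, and why controlled items are needed.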

Code & Models

Repositories

goldengua/counterfactual_inference_lm (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)

Methods: Multi-Head Attention · Attention Is All You Need · Test · Cosine Annealing · Linear Warmup With Cosine Annealing · Adam · Softmax · Layer Normalization · Dropout