Causal Reasoning and Large Language Models: Opening a New Frontier for Causality

Emre Kıcıman; Robert Ness; Amit Sharma; Chenhao Tan

arXiv:2305.00050·cs.AI·August 21, 2024·92 cites

TL;DR

This paper benchmarks the ability of large language models (LLMs) to generate causal arguments, showing that they outperform existing methods on several tasks and can assist human experts in causal analysis, despite some unpredictable failure modes.

Contribution

It demonstrates that LLMs can effectively perform causal reasoning tasks, surpassing prior algorithms, and explores their potential to aid in causal analysis across domains.

Findings

01 LLMs outperform existing algorithms in causal discovery and reasoning tasks

02 LLMs generalize well to datasets created after the training cutoff

03 LLMs exhibit some unpredictable failure modes

Abstract

The causal capabilities of large language models (LLMs) are a matter of significant debate, with critical implications for the use of LLMs in societally impactful domains such as medicine, science, law, and policy. We conduct a "behavioral" study of LLMs to benchmark their capability in generating causal arguments. Across a wide range of tasks, we find that LLMs can generate text corresponding to correct causal arguments with high probability, surpassing the best-performing existing methods. Algorithms based on GPT-3.5 and 4 outperform existing algorithms on a pairwise causal discovery task (97%, 13 points gain), counterfactual reasoning task (92%, 20 points gain) and event causality (86% accuracy in determining necessary and sufficient causes in vignettes). We perform robustness checks across tasks and show that the capabilities cannot be explained by dataset memorization alone,…
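To make the pairwise causal discovery task mentioned in the abstract more concrete, here is a minimal sketch of how such an evaluation loop might look. Everything in it is an assumption for illustration: the prompt wording, the helper names (`build_pairwise_prompt`, `parse_answer`, `pairwise_accuracy`), the toy variable pairs, and the stub model are hypothetical and are not taken from the paper or from the `pywhy-llm` repository.

```python
from typing import Callable, Iterable, Tuple

# (variable A, variable B, gold label): "A" means A causes B, "B" means B causes A.
Pair = Tuple[str, str, str]


def build_pairwise_prompt(var_a: str, var_b: str) -> str:
    """Build a two-option causal-direction question (hypothetical wording)."""
    return (
        "Which cause-and-effect relationship is more likely?\n"
        f"A. {var_a} causes {var_b}\n"
        f"B. {var_b} causes {var_a}\n"
        "Answer with a single letter, A or B."
    )


def parse_answer(reply: str) -> str:
    """Return 'A' or 'B' if the reply starts with that letter, else '?'."""
    first = reply.strip().upper()[:1]
    return first if first in ("A", "B") else "?"


def pairwise_accuracy(pairs: Iterable[Pair], ask: Callable[[str], str]) -> float:
    """Fraction of pairs for which the model's parsed answer matches the gold label."""
    correct = 0
    total = 0
    for var_a, var_b, gold in pairs:
        answer = parse_answer(ask(build_pairwise_prompt(var_a, var_b)))
        correct += int(answer == gold)
        total += 1
    return correct / total if total else 0.0


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call; always answers 'A'."""
    return "A"


# Toy pairs invented for this sketch, not drawn from any benchmark.
toy_pairs = [
    ("altitude", "temperature", "A"),
    ("lung cancer", "smoking", "B"),
]

score = pairwise_accuracy(toy_pairs, stub_model)
print(f"accuracy = {score:.2f}")
```

In practice `stub_model` would be replaced by a function that sends the prompt to an LLM and returns its text reply; the accuracy computation stays the same.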

Code & Models

Repositories

py-why/pywhy-llm
Official

Taxonomy

Topics: Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Adam · Layer Normalization · Linear Layer · Dropout · Byte Pair Encoding · Weight Decay