Do LLMs Have the Generalization Ability in Conducting Causal Inference?

Chen Wang; Dongming Zhao; Bo Wang; Ruifang He; Yuexian Hou

arXiv:2410.11385·cs.CL·October 16, 2024

TL;DR

This study evaluates how well large language models generalize causal inference to unseen phenomena. The models handle simple tasks well but struggle with complex questions and with questions whose phenomenon names contain familiar terms.

Contribution

The paper introduces a benchmark generation framework for assessing LLMs' generalization in causal inference on unseen data and systematically evaluates five leading LLMs across four causal inference tasks.

Findings

01. LLMs perform well on simple causal path discovery and factual inference.

02. Performance drops on backdoor adjustment and complex counterfactual questions.

03. Familiar terms in phenomenon names can hinder generalization performance.

Abstract

In causal inference, generalization capability refers to the ability to apply causal inference methods to new data in order to estimate the causal effects between unknown phenomena, which is crucial for expanding the boundaries of knowledge. Prior studies have evaluated the causal inference capabilities of Large Language Models (LLMs) on known phenomena, yet their generalization to unseen phenomena remains unexplored. In this paper, we select four tasks as representatives of causal inference: Causal Path Discovery (CP), Backdoor Adjustment (BA), Factual Inference (FI), and Counterfactual Inference (CI). To generate evaluation questions about previously unseen phenomena in new data for the four tasks, we propose a benchmark generation framework, which employs randomly generated graphs and node names to formulate questions within hypothetical new causal…
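The abstract describes generating questions from random graphs with random node names, so that a model cannot rely on memorized facts. The Python sketch below illustrates that idea for a Causal Path Discovery style question only. It is not the authors' framework (see the repository listed under Code & Models for that): the function names, the edge probability, the name length, and the question wording are all assumptions made for illustration.

```python
import random
import string


def random_name(rng, length=5):
    """Return a meaningless lowercase token to use as a node name (assumed format)."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def random_dag(rng, n_nodes=5, edge_prob=0.4):
    """Build a random DAG as {parent: [children]}.

    Edges only run from earlier to later nodes in a fixed order,
    which guarantees that the graph has no cycles.
    """
    names = []
    while len(names) < n_nodes:
        name = random_name(rng)
        if name not in names:  # keep node names unique
            names.append(name)
    graph = {name: [] for name in names}
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if rng.random() < edge_prob:
                graph[names[i]].append(names[j])
    return graph


def has_causal_path(graph, source, target):
    """Depth-first search for a directed path from source to target."""
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph[node])
    return False


def make_question(rng, graph):
    """Return (prompt, ground_truth) for one hypothetical path question."""
    source, target = rng.sample(list(graph), 2)
    edges = [f"{p} causes {c}" for p, cs in graph.items() for c in cs]
    prompt = (
        "Consider these hypothetical causal relations: "
        + ("; ".join(edges) if edges else "none")
        + f". Does {source} have a causal effect on {target}?"
    )
    return prompt, has_causal_path(graph, source, target)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run is reproducible
    prompt, answer = make_question(rng, random_dag(rng))
    print(prompt)
    print("Ground truth:", answer)
```

Because the ground truth is computed from the graph itself, each generated question can be scored automatically. The other three tasks would need additional machinery (for example, data sampled from the graph) that this sketch does not attempt.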

Code & Models

Repositories

prayingsociety/ci_bench
none · Official

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Bayesian Modeling and Causal Inference

Methods: Causal inference