Do LLMs Have the Generalization Ability in Conducting Causal Inference?
Chen Wang, Dongming Zhao, Bo Wang, Ruifang He, Yuexian Hou

TL;DR
This study evaluates whether large language models can generalize causal inference to unseen phenomena. The models handle simple tasks well but struggle with complex questions and with questions whose phenomenon names contain familiar terms.
Contribution
The paper introduces a benchmark generation framework for assessing LLMs' generalization in causal inference on unseen data and systematically evaluates five leading LLMs across four causal inference tasks.
Findings
LLMs perform well on simple causal path discovery and factual inference.
Performance drops on backdoor adjustment and complex counterfactual questions.
Familiar terms in phenomenon names can hinder generalization performance.
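For reference, the backdoor adjustment named in the findings is usually stated as the identity below. This is the standard textbook form, not a formula quoted from the paper, and the symbols X (treatment), Y (outcome), and Z (adjustment set) are chosen here for illustration. It assumes that Z satisfies the backdoor criterion relative to (X, Y) and that Z is discrete.

```latex
P\bigl(Y = y \mid \mathrm{do}(X = x)\bigr)
  = \sum_{z} P\bigl(Y = y \mid X = x,\, Z = z\bigr)\, P\bigl(Z = z\bigr)
```

Answering a backdoor adjustment question therefore involves two steps: choosing a valid adjustment set Z from the causal graph, then combining the conditional probabilities as in the sum above.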
Abstract
In causal inference, generalization capability refers to the ability to apply causal inference methods to new data in order to estimate the causal effects between unknown phenomena, which is crucial for expanding the boundaries of knowledge. Studies have evaluated the causal inference capabilities of Large Language Models (LLMs) concerning known phenomena, yet the generalization capabilities of LLMs concerning unseen phenomena remain unexplored. In this paper, we select four tasks as representatives of causal inference: Causal Path Discovery (CP), Backdoor Adjustment (BA), Factual Inference (FI), and Counterfactual Inference (CI). To generate evaluation questions about previously unseen phenomena in new data for the four tasks, we propose a benchmark generation framework, which employs randomly generated graphs and node names to formulate questions within hypothetical new causal…
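The idea of building questions from randomly generated graphs and node names can be illustrated with a minimal sketch. This is not the authors' framework: the function names (`random_name`, `random_dag`, `directed_paths`, `make_cp_question`), the question wording, and parameters such as `edge_prob` are all hypothetical and chosen only to show how a Causal Path Discovery item about made-up phenomena might be produced and answered automatically.

```python
import random
import string


def random_name(rng, length=6):
    """Hypothetical helper: a random lowercase string used as a phenomenon name."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def random_dag(rng, n_nodes=5, edge_prob=0.4):
    """Draw unique random node names and a random directed acyclic graph.

    Edges only go from an earlier node to a later node in the list,
    which guarantees that the graph has no cycles.
    """
    names = []
    while len(names) < n_nodes:
        name = random_name(rng)
        if name not in names:
            names.append(name)
    edges = [
        (names[i], names[j])
        for i in range(n_nodes)
        for j in range(i + 1, n_nodes)
        if rng.random() < edge_prob
    ]
    return names, edges


def directed_paths(edges, source, target):
    """Return every directed path from source to target (depth-first search)."""
    children = {}
    for u, v in edges:
        children.setdefault(u, []).append(v)
    paths, stack = [], [[source]]
    while stack:
        path = stack.pop()
        if path[-1] == target:
            paths.append(path)
            continue
        for child in children.get(path[-1], []):
            stack.append(path + [child])
    return paths


def make_cp_question(edges, source, target):
    """Assumed question template; the paper's actual wording is not shown here."""
    parts = [f"{u} directly causes {v}." for u, v in edges]
    parts.append(f"Is there a causal path from {source} to {target}?")
    return " ".join(parts)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
names, edges = random_dag(rng)
source, target = names[0], names[-1]
question = make_cp_question(edges, source, target)
answer = "yes" if directed_paths(edges, source, target) else "no"
print(question)
print(answer)
```

Because the node names are random strings, a model cannot rely on memorized facts about real phenomena, and the ground-truth answer comes directly from the generated graph.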
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Bayesian Modeling and Causal Inference
Methods: Causal inference
