CausalFlip: A Benchmark for LLM Causal Judgment Beyond Semantic Matching
Yuzhe Wang, Yaochen Zhu, Jundong Li

TL;DR
This paper introduces CausalFlip, a benchmark for evaluating and improving large language models' ability to reason causally rather than by semantic pattern matching, using adversarial question pairs and noisy-prefix evaluation.
Contribution
The paper proposes CausalFlip, a novel causal reasoning benchmark with adversarial question pairs and noisy-prefix evaluation, and assesses different training paradigms to enhance causal reasoning in LLMs.
Findings
Explicit Chain-of-Thought can be misled by semantic correlations.
Internalized causal reasoning improves causal grounding.
Models trained with internalized reasoning outperform answer-only and explicit CoT methods.
Abstract
As large language models (LLMs) witness increasing deployment in complex, high-stakes decision-making scenarios, it becomes imperative to ground their reasoning in causality rather than spurious correlations. However, strong performance on traditional reasoning benchmarks does not guarantee true causal reasoning ability of LLMs, as high accuracy may still arise from memorizing semantic patterns instead of analyzing the underlying true causal structures. To bridge this critical gap, we propose a new causal reasoning benchmark, CausalFlip, designed to encourage the development of new LLM paradigms or training algorithms that ground LLM reasoning in causality rather than semantic correlation. CausalFlip consists of causal judgment questions built over event triples that could form different confounder, chain, and collider relations. Based on this, for each event triple, we construct pairs…
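To make the three relation types named in the abstract concrete, here is a minimal Python sketch of how one event triple can yield different causal answers depending on its structure. It is an illustration under stated assumptions, not the benchmark's actual construction: the node names X, Z, Y, the `STRUCTURES` templates, and the `has_directed_path` helper are all hypothetical, and "X causes Y" is simplified to "a directed path from X to Y exists in the graph".

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)

# Hypothetical edge templates over an event triple (X, Z, Y).
# These follow the standard textbook definitions of the three structures.
STRUCTURES: Dict[str, List[Edge]] = {
    "chain": [("X", "Z"), ("Z", "Y")],       # X -> Z -> Y
    "confounder": [("Z", "X"), ("Z", "Y")],  # X <- Z -> Y
    "collider": [("X", "Z"), ("Y", "Z")],    # X -> Z <- Y
}


def has_directed_path(edges: List[Edge], source: str, target: str) -> bool:
    """Return True if target is reachable from source along directed edges."""
    children: Dict[str, Set[str]] = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)

    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False


# Simplified ground-truth label for "Does X cause Y?" under each structure.
labels = {
    name: has_directed_path(edges, "X", "Y")
    for name, edges in STRUCTURES.items()
}

for name, answer in labels.items():
    print(f"{name}: X causes Y = {answer}")
```

In this simplified view the same three events give a "yes" under the chain and a "no" under the confounder and collider, so a correct answer has to come from the structure rather than from which events appear in the question.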
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
