Nuance Matters: Probing Epistemic Consistency in Causal Reasoning
Shaobo Cui, Junyou Li, Luca Mouchel, Yiyang Feng, Boi Faltings

TL;DR
This paper introduces the concept of causal epistemic consistency and proposes new metrics to evaluate LLMs' ability to differentiate nuanced causal intermediates, revealing current models' limitations in maintaining this consistency.
Contribution
It presents a novel framework and metrics for assessing LLMs' self-consistency in causal reasoning involving nuanced intermediates, backed by extensive empirical analysis.
Findings
Current LLMs struggle with epistemic consistency in causal reasoning.
Proposed metrics effectively evaluate LLMs' causal reasoning consistency.
Internal token probabilities can aid in maintaining causal epistemic consistency.
Abstract
Our study introduces the concept of causal epistemic consistency, which focuses on the self-consistency of Large Language Models (LLMs) in differentiating intermediates with nuanced differences in causal reasoning. We propose a suite of novel metrics -- intensity ranking concordance, cross-group position agreement, and intra-group clustering -- to evaluate LLMs on this front. Through extensive empirical studies on 21 high-profile LLMs, including GPT-4, Claude3, and LLaMA3-70B, we find evidence that current models struggle to maintain epistemic consistency in identifying the polarity and intensity of intermediates in causal reasoning. Additionally, we explore the potential of using internal token probabilities as an auxiliary tool to maintain causal epistemic consistency. In summary, our study bridges a critical gap in AI research by investigating the…
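The abstract names intensity ranking concordance as one of the metrics but this summary does not give its definition. As a rough illustration of the underlying idea only, the sketch below compares the intensity order a model assigns when it generates intermediates with the order it assigns when it later ranks them, using Kendall's tau as a stand-in rank-agreement measure. The function name `kendall_tau` and the example rankings are assumptions made for illustration; they are not taken from the paper and may differ from the authors' actual formulation.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a between two rankings of the same items (no ties assumed).

    rank_a[i] and rank_b[i] are the positions given to item i by two sources,
    e.g. the model's generation-time order and its own later ranking.
    Returns a value in [-1, 1]; 1 means the two orders agree on every pair.
    """
    if len(rank_a) != len(rank_b) or len(rank_a) < 2:
        raise ValueError("rankings must have the same length (at least 2)")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        # Positive product: both rankings order items i and j the same way.
        product = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical example: four intermediates, ordered by intended intensity at
# generation time (1 = weakest) versus the order the same model assigns later.
generated_order = [1, 2, 3, 4]
judged_order = [1, 3, 2, 4]
print(kendall_tau(generated_order, judged_order))
```

A score near 1 would indicate that the model ranks its own intermediates the way it intended when generating them, while a score near 0 or below would indicate the kind of inconsistency the paper reports.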
Taxonomy
Topics: Epistemology, Ethics, and Metaphysics · Logic, Reasoning, and Knowledge · Multi-Agent Systems and Negotiation
