Structured Thinking Matters: Improving LLMs Generalization in Causal Inference Tasks

Wentao Sun, João Paulo Nogueira, and Alonso Silva

arXiv:2505.18034·cs.AI·May 28, 2025

TL;DR

This paper introduces a structured knowledge graph approach to improve large language models' ability to distinguish causation from correlation, raising their F1 score on the Corr2Cause causal inference benchmark.

Contribution

The paper proposes a novel method that guides LLMs to build structured knowledge graphs, improving causal reasoning beyond traditional prompting techniques.

Findings

1. F1 score improved from 32.71 to 48.26 on the Corr2Cause benchmark
2. Significant gains in precision and recall observed
3. Method demonstrates potential for broader causal inference tasks
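
To put the reported F1 figures in perspective, the short sketch below computes the absolute and relative improvement and recalls how F1 combines precision and recall. The function and variable names are illustrative assumptions; only the two F1 values come from the findings above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F1 values reported in the findings (percentage points).
starting_f1 = 32.71
structured_f1 = 48.26

# Absolute gain in F1 points and gain relative to the starting value.
absolute_gain = structured_f1 - starting_f1
relative_gain = absolute_gain / starting_f1

print(f"Absolute gain: {absolute_gain:.2f} F1 points")
print(f"Relative gain: {relative_gain:.1%}")
```

Because F1 is a harmonic mean, it rises substantially only when precision and recall both improve, which is consistent with the gains in both metrics noted in the findings.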

Abstract

Despite remarkable advances in the field, LLMs remain unreliable in distinguishing causation from correlation. Recent results from the Corr2Cause dataset benchmark reveal that state-of-the-art LLMs -- such as GPT-4 (F1 score: 29.08) -- only marginally outperform random baselines (Random Uniform, F1 score: 20.38), indicating limited capacity of generalization. To tackle this limitation, we propose a novel structured approach: rather than directly answering causal queries, we provide the model with the capability to structure its thinking by guiding the model to build a structured knowledge graph, systematically encoding the provided correlational premises, to answer the causal queries. This intermediate representation significantly enhances the model's causal capabilities. Experiments on the test subset of the Corr2Cause dataset benchmark with Qwen3-32B model (reasoning model) show…
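
As a rough illustration of the intermediate representation the abstract describes, the sketch below encodes correlational premises into a small graph-like structure. It is not the authors' implementation, prompt, or schema: the premise wording ("X correlates with Y", "X is independent of Y given Z"), the `PremiseGraph` class, and the `build_graph` function are all assumptions made for this example. In the paper the model itself is guided to build the graph; here a simple regex parser stands in for that step.

```python
import re
from dataclasses import dataclass, field

# Hypothetical premise patterns, assumed for this sketch only.
CORRELATES = re.compile(r"^(\w+) correlates with (\w+)$")
INDEPENDENT = re.compile(r"^(\w+) is independent of (\w+)(?: given (.+))?$")


@dataclass
class PremiseGraph:
    """Illustrative container for the statistical facts stated in the premises."""
    variables: set = field(default_factory=set)
    correlated: set = field(default_factory=set)      # unordered variable pairs
    independent: list = field(default_factory=list)   # (pair, conditioning set)


def build_graph(premises: str) -> PremiseGraph:
    """Parse period-separated premise sentences into a PremiseGraph."""
    graph = PremiseGraph()
    for sentence in premises.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue
        if match := CORRELATES.match(sentence):
            a, b = match.groups()
            graph.variables |= {a, b}
            graph.correlated.add(frozenset((a, b)))
        elif match := INDEPENDENT.match(sentence):
            a, b, given = match.groups()
            # Split a conditioning list such as "A", "A and D", or "A, D and E".
            parts = re.split(r"\s*(?:,|\band\b)\s*", given) if given else []
            conditioning = frozenset(p for p in parts if p)
            graph.variables |= {a, b} | conditioning
            graph.independent.append((frozenset((a, b)), conditioning))
        else:
            raise ValueError(f"Unrecognized premise: {sentence!r}")
    return graph


graph = build_graph(
    "A correlates with B. A correlates with C. B is independent of C given A."
)
print(sorted(graph.variables))
print(graph.independent)
```

Storing pairs as `frozenset` objects keeps correlation and independence symmetric, so "A correlates with B" and "B correlates with A" map to the same entry. A causal query could then be checked against this explicit structure rather than against the raw premise text.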

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Data Quality and Management · AI-based Problem Solving and Planning