Causal Cartographer: From Mapping to Reasoning Over Counterfactual Worlds

Gaël Gendron; Jože M. Rožanec; Michael Witbrock; Gillian Dobbie

arXiv:2505.14396·cs.AI·May 21, 2025

TL;DR

The paper introduces the Causal Cartographer framework, which extracts causal knowledge and enhances large language models' ability to perform reliable counterfactual reasoning by constructing causal networks and constrained inference agents.

Contribution

It presents a novel framework combining causal relationship extraction with reasoning agents to improve counterfactual reasoning in large language models.

Findings

01. Extracted large networks of real-world causal relationships.

02. Improved robustness of LLMs in causal reasoning tasks.

03. Reduced inference costs and spurious correlations.

Abstract

Causal world models are systems that can answer counterfactual questions about an environment of interest, i.e. predict how it would have evolved if an arbitrary subset of events had been realized differently. It requires understanding the underlying causes behind chains of events and conducting causal inference for arbitrary unseen distributions. So far, this task eludes foundation models, notably large language models (LLMs), which do not have demonstrated causal reasoning capabilities beyond the memorization of existing causal relationships. Furthermore, evaluating counterfactuals in real-world applications is challenging since only the factual world is observed, limiting evaluation to synthetic datasets. We address these problems by explicitly extracting and modeling causal relationships and propose the Causal Cartographer framework. First, we introduce a graph retrieval-augmented…

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 3

Strengths

The paper addresses an important gap between abstract causal reasoning and real-world data extraction. Its proposed combination of causal extraction and counterfactual reasoning within an LLM framework is both ambitious and well-motivated. The introduction of CausalWorld, a large-scale, structured repository of 975 nodes and 1337 causal relations, is an impressive resource that could stimulate further research. The integration of Graph-RAG retrieval ensures grounding in prior causal context duri

Weaknesses

Despite its strengths, the paper has several limitations that hinder its maturity for a top-tier conference. The evaluation is limited in scope and realism: the CausalWorld-CR dataset is constructed via synthetic matching across news articles rather than ground-truth counterfactual data. This raises concerns about the validity of “real-world” claims and the soundness of the evaluation metric.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

- The paper argues well for why explicit causal constraints can mitigate spurious correlations and reduce inference cost.
- Decomposing the task into extraction and reasoning, each handled by its own agent, lets each agent focus on its own task.
- The causal-blanket definition and K-Matching Equivalence theorem formalize when matched worlds yield valid counterfactual targets, which is useful for this emerging evaluation paradigm.
- The reported token/input reductions and output length shrink

Weaknesses

- The text corpus is 2020 news with a focus on economics. What factors led to this choice? How well does this approach perform in other domains?
- The method leans on SCM framing (DAGs), yet the constructed CausalWorld allows cycles/feedback loops (Fig. 6).
- Causal blankets are defined as fully determining the target (deterministic f). Real news variables are often noisy. Can the theorem and agent be generalized to stochastic blankets?
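The tension between the DAG framing and a graph with feedback loops can be checked mechanically. The sketch below uses Kahn's algorithm to test whether a set of extracted cause-effect pairs is acyclic. It is an illustration under stated assumptions, not code from the paper or its repository: the function name `is_dag` and the edge-list input format are hypothetical.

```python
from collections import defaultdict, deque


def is_dag(edges):
    """Return True if the (cause, effect) pairs form a directed acyclic graph."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1
        nodes.update((cause, effect))

    # Kahn's algorithm: repeatedly remove nodes that have no remaining parents.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Any node never removed sits on, or downstream of, a cycle.
    return visited == len(nodes)
```

A chain such as `[("a", "b"), ("b", "c")]` passes the check, while a feedback loop such as `[("a", "b"), ("b", "a")]` fails it, which is the situation the reviewer describes.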

Reviewer 03 · Rating 4 · Confidence 4

Strengths

1. Understanding LLM performance on counterfactual reasoning tasks is crucial to furthering research on LLMs' ability to perform causal tasks.
2. Using real-world data instead of synthetic data is encouraging.
3. The proposed method is more interpretable, which is good for future research.

Weaknesses

1. It is not clear how this method can scale to production LLM systems.
2. Building the causal graph would require very careful control so as not to introduce bias.

Code & Models

Repositories

ggendro/causal-cartographer (Official)

Taxonomy

Topics: Semantic Web and Ontologies · Geographic Information Systems Studies

Methods: Counterfactual Explanations · Causal inference