ClimateCause: Complex and Implicit Causal Structures in Climate Reports
Liesbeth Allein, Nataly Pineda-Castañeda, Andrea Rocci, Marie-Francine Moens

TL;DR
ClimateCause is a new expert-annotated dataset capturing complex, implicit, and nested causal structures in climate reports, aiding causal reasoning and benchmarking language models.
Contribution
The paper introduces ClimateCause, a dataset with detailed annotations of higher-order causal relations in climate science reports, addressing gaps in existing causal datasets.
Findings
ClimateCause enables better quantification of causal complexity in climate reports.
Benchmarking shows causal chain reasoning remains a significant challenge for language models.
The dataset supports analysis of readability based on causal graph complexity.
Abstract
Understanding climate change requires reasoning over complex causal networks. Yet, existing causal discovery datasets predominantly capture explicit, direct causal relations. We introduce ClimateCause, a manually expert-annotated dataset of higher-order causal structures from science-for-policy climate reports, including implicit and nested causality. Cause-effect expressions are normalized and disentangled into individual causal relations to facilitate graph construction, with unique annotations for cause-effect correlation, relation type, and spatiotemporal context. We further demonstrate ClimateCause's value for quantifying readability based on the semantic complexity of causal graphs underlying a statement. Finally, large language model benchmarking on correlation inference and causal chain reasoning highlights the latter as a key challenge.
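To illustrate the graph construction the abstract describes, here is a minimal sketch of how disentangled cause-effect relations could be assembled into a directed graph and how one simple complexity measure (the longest causal chain) could be computed. The record layout, the example relations, and the function names `build_graph` and `longest_chain` are assumptions made for illustration. They are not the dataset's actual schema or the authors' readability metric.

```python
from collections import defaultdict

# Hypothetical relation records: (cause, effect, correlation).
# The field layout and the example content are illustrative only.
relations = [
    ("greenhouse gas emissions", "global warming", "positive"),
    ("global warming", "sea level rise", "positive"),
    ("sea level rise", "coastal flooding", "positive"),
]


def build_graph(rels):
    """Map each cause to the list of (effect, correlation) pairs it leads to."""
    graph = defaultdict(list)
    for cause, effect, correlation in rels:
        graph[cause].append((effect, correlation))
    return graph


def longest_chain(graph, node, visited=frozenset()):
    """Count the edges on the longest causal chain starting at `node`.

    Nodes already on the current path are skipped, so cycles terminate.
    """
    visited = visited | {node}
    best = 0
    for effect, _ in graph.get(node, []):
        if effect in visited:
            continue
        best = max(best, 1 + longest_chain(graph, effect, visited))
    return best


graph = build_graph(relations)
depth = longest_chain(graph, "greenhouse gas emissions")
print(depth)
```

Chain depth is only one possible proxy for the semantic complexity of a causal graph. Other graph statistics, such as node count or branching factor, could be computed from the same structure.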
