Large Language Models for Constraint-Based Causal Discovery
Kai-Hendrik Cohrs, Gherardo Varando, Emiliano Diaz, Vasileios Sitokonstantinou, Gustau Camps-Valls

TL;DR
This paper investigates using Large Language Models as an alternative to domain experts for causal graph construction, employing prompts for conditional independence queries and enhancing performance with a voting schema.
Contribution
It introduces a novel LLM-based approach for causal discovery, framing independence queries as prompts and improving accuracy with a voting mechanism.
Findings
LLMs can perform causal reasoning with variable accuracy.
A voting schema improves false-positive and false-negative control.
Evidence suggests LLMs could complement data-driven causal discovery methods.
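As a rough illustration of how a conditional independence query might be framed as a prompt and stabilized by voting, here is a minimal Python sketch. The prompt wording, the helper names (`build_ci_prompt`, `parse_answer`, `vote_independent`), the `ask` callable standing in for an LLM call, and the simple vote-share threshold are all assumptions made for this example; the paper's actual prompt and voting rule are not reproduced here.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def build_ci_prompt(x: str, y: str, given: Sequence[str] = ()) -> str:
    """Phrase 'is x independent of y given `given`?' as a yes/no question.

    The wording is hypothetical, not the prompt used in the paper.
    """
    cond = f" given {', '.join(given)}" if given else ""
    return f"Is {x} independent of {y}{cond}? Answer YES or NO."


def parse_answer(text: str) -> Optional[str]:
    """Return 'YES' or 'NO' from the first word of a reply, else None."""
    tokens = text.upper().split()
    first = tokens[0].strip(".,!:;") if tokens else ""
    return first if first in ("YES", "NO") else None


def vote_independent(
    ask: Callable[[str], str],
    x: str,
    y: str,
    given: Sequence[str] = (),
    n_samples: int = 10,
    threshold: float = 0.5,
) -> bool:
    """Ask the same question n_samples times and vote on the answers.

    Independence is declared only when the share of YES among the
    parseable answers exceeds `threshold` (an assumed decision rule).
    """
    prompt = build_ci_prompt(x, y, given)
    votes = Counter(parse_answer(ask(prompt)) for _ in range(n_samples))
    valid = votes["YES"] + votes["NO"]
    if valid == 0:
        return False  # no usable answer: conservatively keep the dependence
    return votes["YES"] / valid > threshold
```

In this sketch, raising `threshold` makes the oracle more reluctant to declare independence and lowering it makes it more willing, which is one simple way to trade one kind of error against the other.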
Abstract
Causality is essential for understanding complex systems, such as the economy, the brain, and the climate. Constructing causal graphs often relies on either data-driven or expert-driven approaches, both fraught with challenges. The former methods, like the celebrated PC algorithm, face issues with data requirements and assumptions of causal sufficiency, while the latter demand substantial time and domain knowledge. This work explores the capabilities of Large Language Models (LLMs) as an alternative to domain experts for causal graph generation. We frame conditional independence queries as prompts to LLMs and employ the PC algorithm with the answers. The performance of the LLM-based conditional independence oracle on systems with known causal graphs shows a high degree of variability. We improve the performance through a proposed statistical-inspired voting schema that allows some…
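To make concrete what running the PC algorithm on an oracle's answers can look like, the following Python sketch implements only the skeleton (edge-removal) phase of the classic PC algorithm on top of an arbitrary independence oracle. The function name `pc_skeleton` and the `is_independent(x, y, cond)` signature are assumptions for illustration; edge orientation is omitted, and this is not the authors' implementation. The oracle can be any callable that answers conditional independence queries, for instance one backed by an LLM.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Sequence, Set, Tuple


def pc_skeleton(
    variables: Sequence[str],
    is_independent: Callable[[str, str, Tuple[str, ...]], bool],
) -> Tuple[Set[FrozenSet[str]], Dict[FrozenSet[str], Set[str]]]:
    """Skeleton phase of the PC algorithm driven by an independence oracle.

    Starts from the complete undirected graph and removes the edge x - y
    whenever the oracle reports x independent of y given some subset of
    x's other neighbours. Returns the remaining edges and the separating
    sets that justified each removal.
    """
    adj = {v: set(variables) - {v} for v in variables}
    sepset: Dict[FrozenSet[str], Set[str]] = {}
    size = 0  # current size of the conditioning sets
    while any(len(adj[x]) - 1 >= size for x in variables):
        for x in variables:
            for y in list(adj[x]):  # snapshot: adj[x] may shrink below
                others = adj[x] - {y}
                if len(others) < size:
                    continue
                for cond in combinations(sorted(others), size):
                    if is_independent(x, y, cond):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    edges = {frozenset((x, y)) for x in variables for y in adj[x]}
    return edges, sepset


# Usage with a hand-written oracle for the chain A -> B -> C, where the
# only independence is between A and C given B.
def chain_oracle(x: str, y: str, cond: Tuple[str, ...]) -> bool:
    return {x, y} == {"A", "C"} and "B" in cond


edges, sepset = pc_skeleton(["A", "B", "C"], chain_oracle)
```

Because every edge removal depends on a single oracle answer, one wrong answer can delete a true edge or leave a spurious one, which is why the reliability of the oracle matters for the resulting graph.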
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Software Engineering Research · Data Quality and Management
