Causal Structure Learning Supervised by Large Language Model

Taiyu Ban; Lyuzhou Chen; Derui Lyu; Xiangyu Wang; Huanhuan Chen

arXiv:2311.11689 · cs.AI · November 21, 2023 · 6 cites

TL;DR

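This paper introduces ILS-CSL, an iterative framework that uses Large Language Model feedback to refine causal structures learned from observational data. It targets two weaknesses of earlier LLM-based approaches, unreliable constraints from imperfect LLM inferences and the cost of full pairwise variable analysis, and reports better results than previous CSL methods on eight real-world datasets.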

Contribution

The paper proposes a novel iterative LLM-supervised framework for causal structure learning that enhances robustness and efficiency over existing methods.

Findings

1. ILS-CSL outperforms previous CSL methods on eight real-world datasets.

2. The iterative approach refines causal graphs more accurately using LLM feedback.

3. The method generates high-quality structural constraints, improving causal inference.

Abstract

Causal discovery from observational data is pivotal for deciphering complex relationships. Causal Structure Learning (CSL), which focuses on deriving causal Directed Acyclic Graphs (DAGs) from data, faces challenges due to vast DAG spaces and data sparsity. The integration of Large Language Models (LLMs), recognized for their causal reasoning capabilities, offers a promising direction to enhance CSL by infusing it with knowledge-based causal inferences. However, existing approaches utilizing LLMs for CSL have encountered issues, including unreliable constraints from imperfect LLM inferences and the computational intensity of full pairwise variable analyses. In response, we introduce the Iterative LLM Supervised CSL (ILS-CSL) framework. ILS-CSL innovatively integrates LLM-based causal inference with CSL in an iterative process, refining the causal DAG using feedback from LLMs. This…
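The iterative process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `learn_dag`, `llm_judge`, the verdict strings, and the toy learner and toy LLM are all hypothetical stand-ins, and the sketch assumes (consistent with Reviewer 01's description) that the LLM is consulted only about edges the data-driven learner proposes, rather than about every variable pair.

```python
from typing import Callable, FrozenSet, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def pairwise_query_count(n_variables: int) -> int:
    """Number of LLM queries if every unordered variable pair is checked."""
    return n_variables * (n_variables - 1) // 2


def iterative_llm_supervised_csl(
    learn_dag: Callable[[Set[Edge], Set[Edge]], Set[Edge]],  # hypothetical learner
    llm_judge: Callable[[str, str], str],  # hypothetical LLM wrapper
    max_iters: int = 10,
) -> Tuple[Set[Edge], int]:
    """Alternate between a data-driven learner and LLM checks on its edges."""
    required: Set[Edge] = set()    # edges the LLM endorsed
    forbidden: Set[Edge] = set()   # edges the LLM rejected
    checked: Set[FrozenSet[str]] = set()  # variable pairs already reviewed
    queries = 0

    dag = learn_dag(required, forbidden)
    for _ in range(max_iters):
        unreviewed = [e for e in sorted(dag) if frozenset(e) not in checked]
        if not unreviewed:
            break  # every edge in the current graph has been reviewed
        for cause, effect in unreviewed:
            verdict = llm_judge(cause, effect)  # "forward", "reverse" or "none"
            queries += 1
            checked.add(frozenset((cause, effect)))
            if verdict == "forward":
                required.add((cause, effect))
            elif verdict == "reverse":
                forbidden.add((cause, effect))
                required.add((effect, cause))
            else:
                forbidden.add((cause, effect))
        # Re-learn under the accumulated constraints. A real learner would
        # also enforce acyclicity and fit the data; this sketch does neither.
        dag = learn_dag(required, forbidden)
    return dag, queries


# Toy stand-ins (assumptions for illustration only, not real data or a real LLM).
TRUTH = {("smoking", "cancer"), ("pollution", "cancer"), ("cancer", "xray")}
DATA_EDGES = {("cancer", "smoking"), ("pollution", "cancer"), ("cancer", "xray")}


def toy_learner(required: Set[Edge], forbidden: Set[Edge]) -> Set[Edge]:
    return (DATA_EDGES - forbidden) | required


def toy_llm(cause: str, effect: str) -> str:
    if (cause, effect) in TRUTH:
        return "forward"
    if (effect, cause) in TRUTH:
        return "reverse"
    return "none"


dag, queries = iterative_llm_supervised_csl(toy_learner, toy_llm)
print(sorted(dag), queries, pairwise_query_count(4))
```

In this toy run the learner's one misoriented edge is reversed after a single round, using 3 LLM queries, whereas checking every pair among the 4 variables would take 6. The gap between the two counts grows with the number of variables when the learned graph is sparse.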

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

1. The targeted problem is valid, important, and timely. Since 2023, there has been growing evidence that LLMs hold a form of commonsense causal knowledge, so it is important to consider leveraging them to enhance data-driven CSL for causal discovery.
2. The idea of using the data-driven approach as the driving force, with the LLM deciding edge existence. I appreciate the effort to combine the two in a smart way. The idea is novel and its effectiveness is demonstrated.

Weaknesses

See my detailed questions below

Reviewer 02 · Rating 3 (reject, not good enough) · Confidence 4

Strengths

The new idea of using LLM to provide directed edge (arc) information/constraints for CSL.

Weaknesses

1. The mathematics is neither formal nor rigorous. The notation is not well defined or explained. Although we are familiar with causal inference and can more or less guess what the symbols mean, notation is not consistent across papers, and readers may not guess correctly. The mathematical description is written in a style between strict mathematics and pseudo-code, and so are the proofs. We strongly recommend that the authors rewrite all mathematical…

Reviewer 03 · Rating 3 (reject, not good enough) · Confidence 5

Strengths

The idea of using an LLM to supervise structure learning is interesting. A practical algorithm is proposed that integrates LLM supervision into the structure learning algorithm. In particular, the specific prompt for discovering structural constraints is provided. Experiments are done using a SoTA LLM.

Weaknesses

One issue with the proposed method is that it lacks a theoretical guarantee. It is unclear whether the LLM supervision is accurate, as many LLMs suffer from hallucinations. Moreover, the analysis in Sec. 4 is vague. Lemma 1 appears to be one of the few theoretical results, but its presentation is poor. The proof in the appendix is even harder to follow, if not wrong. At the least, the theoretical problem is not clearly defined. As an experimental paper, the LLM used is not discussed,…

Code & Models

Repositories

tymadara/ils-csl · Official

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Topic Modeling · Advanced Graph Neural Networks

Methods: Causal inference · Causal Structure Learning