Discovering and Reasoning of Causality in the Hidden World with Large Language Models

Chenxi Liu; Yongqiang Chen; Tongliang Liu; Mingming Gong; James Cheng; Bo Han; Kun Zhang

arXiv:2402.03941 · cs.LG · October 14, 2025 · 2 cites

TL;DR

This paper introduces COAT, a framework that leverages large language models to automatically suggest hidden variables and uncover causal structures from unstructured data, enhancing causal discovery in real-world scenarios.

Contribution

The paper presents a novel LLM-based framework, COAT, for automated causal variable suggestion and structure discovery, with theoretical guarantees and practical validation.

Findings

01

COAT effectively uncovers hidden causal variables from unstructured data.

02

The framework achieves reliable causal discovery with theoretical guarantees.

03

Empirical results demonstrate COAT's efficiency on benchmarks and real-world data.

Abstract

Revealing hidden causal variables alongside the underlying causal mechanisms is essential to the development of science. Despite progress over the past decades, existing practice in causal discovery (CD) relies heavily on high-quality measured variables, which are usually provided by human experts. In fact, the lack of well-defined high-level variables behind unstructured data has been a longstanding roadblock to broader real-world application of CD. This procedure can naturally benefit from an automated process that suggests potential hidden variables in the system. Interestingly, large language models (LLMs) are trained on massive observations of the world and have demonstrated great capability in processing unstructured data. To leverage the power of LLMs, we develop a new framework termed Causal representatiOn AssistanT (COAT) that incorporates the rich world knowledge of LLMs…

Taxonomy

Topics: Topic Modeling