Large Causal Models from Large Language Models
Sridhar Mahadevan

TL;DR
This paper presents DEMOCRITUS, a system leveraging large language models to construct, organize, and visualize large-scale causal models across diverse domains, using novel categorical machine learning methods.
Contribution
The paper introduces a paradigm for building large causal models (LCMs) from LLM-generated causal claims, integrating those claims into coherent relational structures using categorical machine learning methods.
Findings
Successfully built causal models in multiple domains
Analyzed the system's computational bottlenecks
Outlined future directions for scaling and extending DEMOCRITUS
Abstract
We introduce a new paradigm for building large causal models (LCMs) that exploits the enormous potential latent in today's large language models (LLMs). We describe our ongoing experiments with an implemented system called DEMOCRITUS (Decentralized Extraction of Manifold Ontologies of Causal Relations Integrating Topos Universal Slices), aimed at building, organizing, and visualizing LCMs that span disparate domains and are extracted from carefully targeted textual queries to LLMs. DEMOCRITUS is methodologically distinct from traditional narrow-domain, hypothesis-centered causal inference, which builds causal models from experiments that produce numerical data. A high-quality LLM is used to propose topics, generate causal questions, and extract plausible causal statements from a diverse range of domains. The technical challenge is then to take these isolated, fragmented, potentially ambiguous…
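To make the extraction step in the abstract concrete, here is a minimal sketch of how isolated causal statements might be turned into (cause, relation, effect) triples and merged into one graph. It is not the DEMOCRITUS implementation: the relation vocabulary, the function names `parse_statement`, `build_graph`, and `stub_llm`, and the canned statements are all assumptions made for illustration, and the stub stands in for a real LLM call.

```python
import re
from collections import defaultdict

# Hypothetical relation phrases; the paper's actual vocabulary is not given here.
RELATIONS = ("causes", "increases", "reduces", "leads to")
_PATTERN = re.compile(
    r"^(.+?)\s+(" + "|".join(RELATIONS) + r")\s+(.+?)\.?$",
    re.IGNORECASE,
)


def parse_statement(text):
    """Split one causal sentence into a (cause, relation, effect) triple, or None."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    cause, relation, effect = (part.strip().lower() for part in match.groups())
    return cause, relation, effect


def build_graph(statements):
    """Merge parsed triples into an adjacency map: cause -> {(relation, effect)}."""
    graph = defaultdict(set)
    for statement in statements:
        triple = parse_statement(statement)
        if triple is not None:
            cause, relation, effect = triple
            graph[cause].add((relation, effect))
    return dict(graph)


def stub_llm(topic):
    """Stand-in for an LLM query (assumption): returns canned causal statements."""
    return [
        "Deforestation leads to soil erosion.",
        "Soil erosion reduces crop yield.",
        "This sentence has no causal relation",
    ]


graph = build_graph(stub_llm("land use"))
for cause, edges in graph.items():
    for relation, effect in sorted(edges):
        print(f"{cause} --{relation}--> {effect}")
```

Normalizing node names to lowercase lets "Soil erosion" as a cause line up with "soil erosion" as an effect, so separately extracted statements chain into a single structure. A real system would need far more robust matching than this regular expression.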
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Topic Modeling · Computational and Text Analysis Methods
