CSQL: Mapping Documents into Causal Databases
Sridhar Mahadevan

TL;DR
CSQL is a system that automatically converts unstructured text documents into a structured causal database, enabling complex causal queries and analysis across large document collections in various domains.
Contribution
The paper introduces CSQL, a novel system that transforms unstructured documents into causal databases, supporting causal analysis where RAG and knowledge-graph approaches offer purely associative retrieval.
Findings
Successfully converts documents into causal databases.
Enables complex causal queries over large corpora.
Demonstrates application in economics and other fields.
Abstract
We describe a novel system, CSQL, which automatically converts a collection of unstructured text documents into an SQL-queryable causal database (CDB). A CDB differs from a traditional DB: it is designed to answer "why" questions via causal interventions and structured causal queries. CSQL builds on our earlier system, DEMOCRITUS, which converts documents into thousands of local causal models derived from causal discourse. Unlike RAG-based systems or knowledge-graph-based approaches, CSQL supports causal analysis over document collections rather than purely associative retrieval. For example, given an article on the origins of human bipedal walking, CSQL enables queries such as: "What are the strongest causal influences on bipedalism?" or "Which variables act as causal hubs with the largest downstream influence?" Beyond single-document case studies, we show that CSQL can also ingest…
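The two example queries in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical single-table schema, `causal_edges(cause, effect, strength)`, and a handful of invented toy rows; neither the schema, the function names, nor the numbers come from the paper, and CSQL's actual tables may look quite different. The sketch only ranks edges and counts reachable nodes; it does not model causal interventions.

```python
import sqlite3


def build_toy_cdb():
    """Create an in-memory database with a hypothetical causal edge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE causal_edges (cause TEXT, effect TEXT, strength REAL)"
    )
    # Invented toy rows for illustration only; not output of CSQL.
    rows = [
        ("open_habitat", "bipedalism", 0.6),
        ("energy_efficiency", "bipedalism", 0.8),
        ("carrying_food", "bipedalism", 0.4),
        ("bipedalism", "free_hands", 0.9),
        ("free_hands", "tool_use", 0.7),
    ]
    conn.executemany("INSERT INTO causal_edges VALUES (?, ?, ?)", rows)
    return conn


def strongest_influences(conn, target, limit=3):
    """Direct causes of `target`, ordered by edge strength (strongest first)."""
    cur = conn.execute(
        "SELECT cause, strength FROM causal_edges "
        "WHERE effect = ? ORDER BY strength DESC LIMIT ?",
        (target, limit),
    )
    return cur.fetchall()


def causal_hubs(conn):
    """Rank variables by how many distinct variables lie downstream of them."""
    query = """
    WITH RECURSIVE reach(source, node) AS (
        SELECT cause, effect FROM causal_edges
        UNION  -- UNION (not UNION ALL) deduplicates, so cycles terminate
        SELECT reach.source, e.effect
        FROM reach JOIN causal_edges AS e ON e.cause = reach.node
    )
    SELECT source, COUNT(DISTINCT node) AS downstream
    FROM reach
    GROUP BY source
    ORDER BY downstream DESC, source ASC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_toy_cdb()
    print(strongest_influences(conn, "bipedalism"))
    print(causal_hubs(conn))
```

Here "hub" is approximated by the number of distinct downstream variables reachable along directed edges, computed with a recursive common table expression. This is one simple reading of "largest downstream influence"; a weighted measure that multiplies strengths along paths would be an equally plausible choice.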
Taxonomy
Topics: Philosophy and History of Science · Bayesian Modeling and Causal Inference · Logic, Reasoning, and Knowledge
