Arrow: A Foundation Model for Causal Discovery
Ryan Thompson, He Zhao, Daniel M. Steinberg, Edwin V. Bonilla

TL;DR
Arrow is a transformer-based foundation model that enables fast, accurate zero-shot causal discovery on observational data by leveraging synthetic training across diverse graph types.
Contribution
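Arrow factorizes a directed acyclic graph into an undirected skeleton and a topological order, which guarantees acyclicity by construction, and is trained end-to-end on synthetic datasets with ground-truth graphs so that it can perform causal discovery zero-shot on new data.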
Findings
Matches or outperforms existing causal discovery methods across in- and out-of-distribution synthetic, semi-synthetic, and real datasets.
Operates with substantially lower inference cost.
Demonstrates effective zero-shot transfer from synthetic training.
Abstract
We introduce Arrow, a foundation model for zero-shot causal discovery on observational tabular data. Arrow factorizes a directed acyclic graph into an undirected skeleton and a topological order, guaranteeing acyclicity by construction. Given a new dataset, it uses a transformer-based architecture to contextualize variables within and across observations, then predicts skeleton edge probabilities and node order scores that together define a graph. Arrow is trained in a supervised fashion on synthetic datasets with ground-truth graphs, using an end-to-end differentiable directed edge composite likelihood induced by the skeleton-order factorization. The training distribution spans diverse graph families, functional forms, noise models, and dataset shapes. Across in- and out-of-distribution synthetic, semi-synthetic, and real datasets, Arrow matches or outperforms existing causal discovery…
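The skeleton-order factorization described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `compose_dag` and `is_acyclic`, the 0.5 threshold, and the convention that a lower order score means "earlier in the order" are all assumptions made for illustration. The sketch only shows why combining a symmetric skeleton with a strict node ordering cannot produce a cycle.

```python
import numpy as np

def compose_dag(skeleton_prob, order_score, threshold=0.5):
    """Combine skeleton edge probabilities and node order scores into a DAG.

    Hypothetical helper: the thresholding rule and the "lower score comes
    first" convention are illustrative assumptions, not taken from the paper.
    """
    skeleton_prob = np.asarray(skeleton_prob, dtype=float)
    order_score = np.asarray(order_score, dtype=float)
    d = len(order_score)

    # Convert scores to strict ranks (ties broken by index via a stable sort).
    rank = np.empty(d, dtype=int)
    rank[np.argsort(order_score, kind="stable")] = np.arange(d)

    # Undirected skeleton: keep an edge only if both directions pass the threshold.
    keep = skeleton_prob > threshold
    keep = keep & keep.T
    np.fill_diagonal(keep, False)

    # Orient each kept edge from the earlier node to the later node.
    earlier = rank[:, None] < rank[None, :]
    return (keep & earlier).astype(int)

def is_acyclic(adj):
    """A graph on d nodes is acyclic iff it has no walk of length d."""
    adj = np.asarray(adj)
    d = adj.shape[0]
    power = np.eye(d, dtype=int)
    for _ in range(d):
        power = (power @ adj > 0).astype(int)
    return not power.any()

# Toy inputs: three variables, skeleton edges 0-1 and 1-2, order 0, 2, 1.
skeleton_prob = [[0.0, 0.9, 0.2],
                 [0.9, 0.0, 0.8],
                 [0.2, 0.8, 0.0]]
order_score = [0.1, 2.0, 1.0]
dag = compose_dag(skeleton_prob, order_score)
```

Because every edge points from a lower rank to a higher rank, any directed path strictly increases in rank and can never return to its starting node, so no post-hoc acyclicity constraint is needed under these assumptions.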
