Arrow: A Foundation Model for Causal Discovery

Ryan Thompson; He Zhao; Daniel M. Steinberg; Edwin V. Bonilla

arXiv:2605.07204 · cs.LG · May 11, 2026

TL;DR

Arrow is a transformer-based foundation model that enables fast, accurate zero-shot causal discovery on observational data by leveraging synthetic training across diverse graph types.

Contribution

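Arrow factorizes a directed acyclic graph into an undirected skeleton and a topological order, and is trained with an end-to-end differentiable directed edge composite likelihood induced by that factorization. This allows accurate and efficient zero-shot causal discovery.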

Findings

01  Matches or outperforms existing causal discovery methods across in- and out-of-distribution synthetic, semi-synthetic, and real datasets.

02  Operates with substantially lower inference cost.

03  Demonstrates effective zero-shot transfer from purely synthetic training data.

Abstract

We introduce Arrow, a foundation model for zero-shot causal discovery on observational tabular data. Arrow factorizes a directed acyclic graph into an undirected skeleton and a topological order, guaranteeing acyclicity by construction. Given a new dataset, it uses a transformer-based architecture to contextualize variables within and across observations, then predicts skeleton edge probabilities and node order scores that together define a graph. Arrow is trained in a supervised fashion on synthetic datasets with ground-truth graphs, using an end-to-end differentiable directed edge composite likelihood induced by the skeleton-order factorization. The training distribution spans diverse graph families, functional forms, noise models, and dataset shapes. Across in- and out-of-distribution synthetic, semi-synthetic, and real datasets, Arrow matches or outperforms existing causal discovery…
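
To make the skeleton-order factorization concrete, here is a minimal NumPy sketch of how skeleton edge probabilities and node order scores could be composed into a graph. It is not the authors' implementation: the function name `decode_dag`, the 0.5 threshold, and the convention that a lower score means an earlier position in the order are all assumptions made for illustration. The point it shows is why acyclicity holds by construction: every kept edge is oriented from the earlier node to the later node in a single total order, so no directed cycle can form.

```python
import numpy as np


def decode_dag(skeleton_prob, order_score, threshold=0.5):
    """Compose a DAG adjacency matrix from a skeleton and an order.

    Hypothetical helper, not from the paper.
    skeleton_prob: (d, d) symmetric matrix of undirected edge probabilities.
    order_score:   (d,) scores; ASSUMPTION: lower score = earlier in the order.
    Returns adj with adj[i, j] == 1 meaning a directed edge i -> j.
    """
    p = np.asarray(skeleton_prob, dtype=float)
    s = np.asarray(order_score, dtype=float)
    d = p.shape[0]

    # Turn scores into ranks; a stable sort breaks ties by node index,
    # so the order is always a strict total order.
    rank = np.empty(d, dtype=int)
    rank[np.argsort(s, kind="stable")] = np.arange(d)

    # Keep skeleton edges above the (assumed) threshold.
    keep = p >= threshold

    # Orient each kept edge from the earlier node to the later node.
    earlier = rank[:, None] < rank[None, :]
    return (keep & earlier).astype(int)


def is_consistent_with_order(adj, order_score):
    """Check that adj is strictly upper triangular once nodes are sorted by score."""
    order = np.argsort(np.asarray(order_score, dtype=float), kind="stable")
    permuted = np.asarray(adj)[np.ix_(order, order)]
    return bool(np.all(np.tril(permuted) == 0))


# Toy example with three variables (values are made up).
skeleton_prob = np.array([
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.8],
    [0.1, 0.8, 0.0],
])
order_score = np.array([0.2, 1.5, -0.3])

adj = decode_dag(skeleton_prob, order_score)
print(adj)
print(is_consistent_with_order(adj, order_score))
```

In this toy case the order is node 2, then node 0, then node 1, so the two retained skeleton edges become 0 -> 1 and 2 -> 1. A hard threshold is only one way to read out a graph; the abstract describes training through a differentiable likelihood over the same two quantities rather than through a thresholded decode.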
