Learning Causal Graphs at Scale: A Foundation Model Approach
Naiyu Yin, Tian Gao, Yue Yu

TL;DR
This paper introduces ADAG, an attention-based foundation model that efficiently learns multiple causal DAGs across tasks, improving accuracy in small-sample regimes and supporting zero-shot inference on new tasks.
Contribution
The paper develops the first foundation-model approach to DAG learning, using attention mechanisms to capture structure shared across tasks and improve small-sample performance.
Findings
ADAG outperforms existing methods in DAG learning accuracy.
ADAG enables zero-shot inference for new tasks.
The approach reduces computational costs in DAG discovery.
Abstract
Owing to its human interpretability and invariance properties, the Directed Acyclic Graph (DAG) has been a foundational tool across many areas of AI research, leading to significant advances. However, DAG learning remains highly challenging because of the super-exponential growth in its computational cost and because of identifiability issues, particularly in small-sample regimes. To address these two challenges, in this work we leverage the recent success of linear transformers and develop a foundation model approach for discovering multiple order-consistent DAGs across tasks. In particular, we propose Attention-DAG (ADAG), a novel attention-mechanism-based architecture for learning multiple linear Structural Equation Models (SEMs). ADAG learns the mapping from observed data to both graph structure and parameters via a nonlinear attention-based kernel, enabling efficient multi-task estimation of the…
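To make the problem setting in the abstract concrete, the following is a minimal sketch of what "multiple order-consistent DAGs" with linear SEMs could look like as data. It is not the ADAG architecture and is not taken from the paper: the function names, the edge probability, the weight range, and the Gaussian noise are all illustrative assumptions. Each task gets its own weighted adjacency matrix, but every matrix respects one shared causal ordering of the variables, and each task contributes only a few samples.

```python
import numpy as np


def sample_order_consistent_dags(num_tasks, num_nodes, edge_prob, rng):
    """Draw one weighted adjacency matrix per task, all sharing one causal order.

    Hypothetical helper for illustration; W[i, j] != 0 means an edge i -> j.
    """
    order = rng.permutation(num_nodes)        # shared topological order
    rank = np.empty(num_nodes, dtype=int)
    rank[order] = np.arange(num_nodes)        # rank[v] = position of node v
    # An edge i -> j is allowed only if i comes before j in the shared order.
    allowed = rank[:, None] < rank[None, :]
    weights = []
    for _ in range(num_tasks):
        mask = allowed & (rng.random((num_nodes, num_nodes)) < edge_prob)
        magnitudes = rng.uniform(0.5, 2.0, size=(num_nodes, num_nodes))
        signs = rng.choice([-1.0, 1.0], size=(num_nodes, num_nodes))
        weights.append(mask * magnitudes * signs)
    return order, np.stack(weights)


def sample_linear_sem(W, num_samples, rng):
    """Sample rows of X satisfying X = X W + E with standard Gaussian noise E."""
    d = W.shape[0]
    noise = rng.normal(size=(num_samples, d))
    # Solving X (I - W) = E; I - W is invertible because a DAG's W is nilpotent.
    return noise @ np.linalg.inv(np.eye(d) - W)


def is_dag(W):
    """A graph on d nodes is acyclic iff the d-th power of its adjacency is zero."""
    d = W.shape[0]
    binary = (W != 0).astype(int)
    return not np.any(np.linalg.matrix_power(binary, d))


rng = np.random.default_rng(0)
order, Ws = sample_order_consistent_dags(num_tasks=3, num_nodes=5, edge_prob=0.5, rng=rng)
# Small-sample regime: only 10 observations per task.
datasets = [sample_linear_sem(W, num_samples=10, rng=rng) for W in Ws]
```

A multi-task learner in this setting would receive `datasets` and try to recover each task's `W`. The shared `order` is what makes pooling information across tasks plausible even when each task has very few samples.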
