Learning Causal Graphs at Scale: A Foundation Model Approach

Naiyu Yin; Tian Gao; Yue Yu

arXiv:2506.18285 · cs.LG · June 24, 2025

TL;DR

This paper introduces ADAG (Attention-DAG), an attention-based foundation model that efficiently learns multiple causal DAGs across tasks, improving accuracy and inference in small-sample regimes.

Contribution

The paper develops what it presents as the first foundation model approach for DAG learning, using attention mechanisms to capture structure shared across tasks and to improve small-sample performance.

Findings

01. ADAG outperforms existing methods in DAG learning accuracy.

02. ADAG enables zero-shot inference for new tasks.

03. The approach reduces computational costs in DAG discovery.

Abstract

Due to its human interpretability and invariance properties, the Directed Acyclic Graph (DAG) has been a foundational tool across various areas of AI research, leading to significant advancements. However, DAG learning remains highly challenging because of the super-exponential growth in its computational cost and because of identifiability issues, particularly in small-sample regimes. To address these two challenges, in this work we leverage the recent success of linear transformers and develop a foundation model approach for discovering multiple order-consistent DAGs across tasks. In particular, we propose Attention-DAG (ADAG), a novel attention-mechanism-based architecture for learning multiple linear Structural Equation Models (SEMs). ADAG learns the mapping from observed data to both graph structure and parameters via a nonlinear attention-based kernel, enabling efficient multi-task estimation of the…
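
To make the problem setting concrete, the following is a minimal sketch of the kind of data the abstract describes: several linear SEMs whose DAGs are order-consistent. It is not the ADAG architecture. It assumes that "order-consistent" means all tasks share one topological ordering of the variables, and the function names, edge probability, and weight ranges are hypothetical choices made for illustration.

```python
import numpy as np

def random_order_consistent_dags(d, n_tasks, edge_prob, rng):
    """Hypothetical helper: draw n_tasks weighted DAGs that share one node ordering.

    W[i, j] != 0 means there is an edge i -> j with that weight.
    """
    order = rng.permutation(d)  # shared topological order (assumption)
    graphs = []
    for _ in range(n_tasks):
        # Strictly upper-triangular mask: edges only go forward in the order.
        mask = np.triu(rng.random((d, d)) < edge_prob, k=1)
        signs = rng.choice([-1.0, 1.0], size=(d, d))
        weights = rng.uniform(0.5, 2.0, size=(d, d)) * signs
        upper = mask * weights
        # Relabel so that node order[a] sits at position a of the ordering.
        W = np.zeros((d, d))
        W[np.ix_(order, order)] = upper
        graphs.append(W)
    return order, graphs

def sample_linear_sem(W, n, rng, noise_scale=1.0):
    """Draw n samples from the linear SEM X = X @ W + E with Gaussian noise E."""
    d = W.shape[0]
    E = rng.normal(scale=noise_scale, size=(n, d))
    # X = X W + E  implies  X (I - W) = E, and I - W is invertible for a DAG.
    return E @ np.linalg.inv(np.eye(d) - W)

def is_dag(W):
    """A graph on d nodes is acyclic iff its adjacency matrix A satisfies A^d = 0."""
    d = W.shape[0]
    A = (W != 0).astype(float)
    return not np.linalg.matrix_power(A, d).any()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    order, graphs = random_order_consistent_dags(d=5, n_tasks=3, edge_prob=0.5, rng=rng)
    for W in graphs:
        X = sample_linear_sem(W, n=20, rng=rng)  # a small-sample task
        print(is_dag(W), X.shape)
```

Each task here yields only 20 samples, which mirrors the small-sample regime the abstract highlights: a per-task estimator sees little data, while a model trained across tasks can exploit the shared ordering.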
