Foundation Models for Causal Inference via Prior-Data Fitted Networks
Yuchen Ma, Dennis Frauen, Emil Javurek, Stefan Feuerriegel

TL;DR
This paper introduces CausalFM, a framework for training foundation models using prior-data fitted networks to perform Bayesian causal inference across various settings, demonstrating competitive in-context learning performance.
Contribution
It formalizes Bayesian priors for causal inference using structural causal models and develops causality-inspired Bayesian neural networks within the PFN framework.
Findings
CausalFM achieves competitive in-context learning performance.
The framework generalizes to multiple causal inference settings.
It offers a new paradigm for causal inference in various disciplines.
Abstract
Prior-data fitted networks (PFNs) have recently been proposed as a promising way to train tabular foundation models. PFNs are transformers that are pre-trained on synthetic data generated from a prespecified prior distribution and that enable Bayesian inference through in-context learning. In this paper, we introduce CausalFM, a comprehensive framework for training PFN-based foundation models in various causal inference settings. First, we formalize the construction of Bayesian priors for causal inference based on structural causal models (SCMs) in a principled way and derive necessary criteria for the validity of such priors. Building on this, we propose a novel family of prior distributions using causality-inspired Bayesian neural networks that enable CausalFM to perform Bayesian causal inference in various settings, including for back-door, front-door, and instrumental variable…
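To make the idea of pre-training on synthetic data from an SCM-based prior more concrete, the following is a minimal sketch of drawing one synthetic dataset for the back-door setting. It is not the authors' prior: the helper names (`random_mlp`, `sample_backdoor_dataset`), the Gaussian weight distributions, the tanh network, and the noise level are all assumptions made for illustration, with the random network standing in for a Bayesian neural network prior.

```python
import numpy as np


def random_mlp(rng, in_dim, hidden=16):
    """Draw one random function by sampling MLP weights from a Gaussian prior.

    Hypothetical stand-in for a BNN prior over structural functions.
    """
    w1 = rng.normal(0.0, 1.0, size=(in_dim, hidden))
    b1 = rng.normal(0.0, 1.0, size=hidden)
    w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), size=hidden)
    return lambda z: np.tanh(z @ w1 + b1) @ w2


def sample_backdoor_dataset(rng, n=200, d=5, noise_std=0.1):
    """Sample one synthetic dataset from a randomly drawn back-door SCM.

    Assumed graph: X -> A, X -> Y, A -> Y, with no unobserved confounding.
    Returns covariates, binary treatment, observed outcome, and the
    ground-truth conditional average treatment effect (CATE) per unit.
    """
    f_a = random_mlp(rng, d)       # treatment mechanism
    f_y = random_mlp(rng, d + 1)   # outcome mechanism

    x = rng.normal(size=(n, d))
    propensity = 1.0 / (1.0 + np.exp(-f_a(x)))
    a = rng.binomial(1, propensity)

    def mean_outcome(treatment):
        # Noise-free outcome under a given treatment vector.
        return f_y(np.column_stack([x, treatment]))

    y = mean_outcome(a) + noise_std * rng.normal(size=n)
    # Known only because the SCM is simulated; serves as the training target.
    cate = mean_outcome(np.ones(n)) - mean_outcome(np.zeros(n))
    return x, a, y, cate


rng = np.random.default_rng(0)
x, a, y, cate = sample_backdoor_dataset(rng)
```

In a PFN-style training loop, many such datasets would be sampled, with `(x, a, y)` supplied as context and the simulated `cate` used as the prediction target. Front-door and instrumental variable settings would need different graphs and are not covered by this sketch.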
Peer Reviews
Decision: ICLR 2026 Poster
- The paper introduces a novel approach by leveraging prior-data fitted networks for causal inference, which might be the first to provide a comprehensive foundation model (CausalFM) covering multiple settings in one framework. Unlike past works that focus on a single identification strategy (e.g. only back-door criteria) or require task-specific models, CausalFM uses SCM-based priors to train a transformer that can flexibly handle back-door, front-door, and IV adjustments within one model. This
- While CausalFM is conceptually flexible, there may be practical scalability challenges. Training a PFN of this sort involves generating and learning from a very large number of synthetic datasets drawn from complex SCM priors, which is computationally intensive. The transformer model itself has a fixed context length and model size – this could limit the scale of datasets it can handle at test time (e.g. number of samples or covariates) unless the architecture is scaled up. The paper does not
Moving beyond dataset-specific causal inference to in-context causal inference is a promising and important research direction. The authors propose a framework that is not restricted to a single causal setting: it supports back-door, front-door, and instrumental variable scenarios within a single model. The introduction of the "well-specified prior" concept and the argument for incorporating identifiability assumptions into the prior are valuable theoretical insights. The results on s
The pre-training methodology in this paper appears largely similar to existing work in this area. The proof sketches are quite dense and lack intuition, which makes them hard to follow. The 24-hour training time on an A100 GPU is mentioned, but there is no discussion of inference speed or of the comparative training costs of the baselines.
- The paper provides a clear theoretical argument for why PFN priors in this Bayesian approximation context should enforce identifiability for consistent estimation (I am a little skeptical if we should make this assumption, see below - but realizing and demonstrating it is very valuable).
- Introduces a structured way to build priors over SCMs using BNNs that respect the assumed causal graph structure and identifiability conditions.
- CausalFM is designed to handle back-door, front-door, and IV
- Reliance on Correct Identifiability Assumptions: The framework's core premise requires the user to correctly identify the true causal structure and select the appropriate identifiability strategy (back-door, front-door, IV) before applying the model. This is a strong assumption, as determining the correct causal graph and valid adjustment strategy from domain knowledge alone is often a major challenge in real-world applications. CausalFM automates estimation given these assumptions but offers
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Explainable Artificial Intelligence (XAI) · Advanced Causal Inference Techniques
Methods: Causal inference
