Foundation Models for Causal Inference via Prior-Data Fitted Networks

Yuchen Ma; Dennis Frauen; Emil Javurek; Stefan Feuerriegel

arXiv:2506.10914·cs.LG·February 25, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces CausalFM, a framework for training foundation models using prior-data fitted networks to perform Bayesian causal inference across various settings, demonstrating competitive in-context learning performance.

Contribution

It formalizes Bayesian priors for causal inference using structural causal models and develops causality-inspired Bayesian neural networks within the PFN framework.

Findings

01. CausalFM achieves competitive in-context learning performance.

02. The framework generalizes to multiple causal inference settings.

03. It offers a new paradigm for causal inference in various disciplines.

Abstract

Prior-data fitted networks (PFNs) have recently been proposed as a promising way to train tabular foundation models. PFNs are transformers that are pre-trained on synthetic data generated from a prespecified prior distribution and that enable Bayesian inference through in-context learning. In this paper, we introduce CausalFM, a comprehensive framework for training PFN-based foundation models in various causal inference settings. First, we formalize the construction of Bayesian priors for causal inference based on structural causal models (SCMs) in a principled way and derive necessary criteria for the validity of such priors. Building on this, we propose a novel family of prior distributions using causality-inspired Bayesian neural networks that enable CausalFM to perform Bayesian causal inference in various settings, including for back-door, front-door, and instrumental variable…
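To make the idea of pre-training on synthetic data from a prior more concrete, the following is a minimal sketch of drawing one synthetic dataset from a randomly sampled back-door SCM. It is an illustration under stated assumptions, not the prior used by CausalFM: the function name `sample_backdoor_dataset`, the one-hidden-layer outcome network, the Gaussian weight draws, and the noise scale are all hypothetical choices. A PFN would be trained on many such draws, with the known ground-truth effect as the prediction target.

```python
import numpy as np

def sample_backdoor_dataset(n=200, d=3, hidden=8, seed=0):
    """Draw one synthetic dataset from a randomly sampled back-door SCM.

    Hypothetical illustration only; not the authors' prior.
    """
    rng = np.random.default_rng(seed)

    # Random weights stand in for one draw from a prior over SCM mechanisms.
    w_a = rng.normal(size=d)                  # treatment mechanism
    w1 = rng.normal(size=(d + 1, hidden))     # outcome mechanism, layer 1
    w2 = rng.normal(size=hidden)              # outcome mechanism, layer 2

    # Observed confounders X; treatment A depends on X only (back-door setting).
    x = rng.normal(size=(n, d))
    propensity = 1.0 / (1.0 + np.exp(-(x @ w_a)))
    a = rng.binomial(1, propensity)

    def outcome_mean(x_in, a_in):
        # Noise-free outcome mechanism as a function of (X, A).
        inputs = np.column_stack([x_in, a_in])
        return np.tanh(inputs @ w1) @ w2

    # Observed outcome: mechanism plus additive Gaussian noise (assumed scale).
    y = outcome_mean(x, a) + 0.1 * rng.normal(size=n)

    # Ground-truth conditional average treatment effect, known because
    # the data-generating process is fully specified.
    cate = outcome_mean(x, np.ones(n)) - outcome_mean(x, np.zeros(n))
    return x, a, y, cate

x, a, y, cate = sample_backdoor_dataset(n=100, d=3, seed=42)
print(x.shape, a.shape, y.shape, cate.shape)
```

Because the mechanisms are sampled rather than estimated, the true effect is available for every synthetic dataset, which is what allows supervised pre-training of an in-context estimator.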

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 5

Strengths

- The paper introduces a novel approach by leveraging prior-data fitted networks for causal inference, which might be the first to provide a comprehensive foundation model (CausalFM) covering multiple settings in one framework. Unlike past works that focus on a single identification strategy (e.g. only back-door criteria) or require task-specific models, CausalFM uses SCM-based priors to train a transformer that can flexibly handle back-door, front-door, and IV adjustments within one model. This

Weaknesses

- While CausalFM is conceptually flexible, there may be practical scalability challenges. Training a PFN of this sort involves generating and learning from a very large number of synthetic datasets drawn from complex SCM priors, which is computationally intensive. The transformer model itself has a fixed context length and model size – this could limit the scale of datasets it can handle at test time (e.g. number of samples or covariates) unless the architecture is scaled up. The paper does not

Reviewer 02 · Rating 6 · Confidence 4

Strengths

Moving beyond dataset-specific causal inference to in-context causal inference is a promising and important research direction. The authors propose a framework that is not restricted to a single causal setting: it supports back-door, front-door, and instrumental variable scenarios within a single model. The introduction of the "well-specified prior" concept and the argument for incorporating identifiability assumptions into the prior are valuable theoretical insights. The results on s

Weaknesses

The pre-training methodology in this paper appears largely similar to existing work in this area. The proof sketches are quite dense and lack intuition, which makes them hard to follow. The 24-hour training time on an A100 GPU is mentioned, but there is no discussion of inference speed or of the comparative training costs of the baselines.

Reviewer 03 · Rating 8 · Confidence 4

Strengths

- The paper provides a clear theoretical argument for why PFN priors in this Bayesian approximation context should enforce identifiability for consistent estimation (I am a little skeptical if we should make this assumption, see below - but realizing and demonstrating it is very valuable).
- Introduces a structured way to build priors over SCMs using BNNs that respect the assumed causal graph structure and identifiability conditions.
- CausalFM is designed to handle back-door, front-door, and IV

Weaknesses

- Reliance on Correct Identifiability Assumptions: The framework's core premise requires the user to correctly identify the true causal structure and select the appropriate identifiability strategy (back-door, front-door, IV) before applying the model. This is a strong assumption, as determining the correct causal graph and valid adjustment strategy from domain knowledge alone is often a major challenge in real-world applications. CausalFM automates estimation given these assumptions but offers

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Explainable Artificial Intelligence (XAI) · Advanced Causal Inference Techniques

Methods: Causal inference