Towards Causal Foundation Model: on Duality between Causal Inference and Attention
Jiaqi Zhang, Joel Jennings, Agrin Hilmkil, Nick Pawlowski, Cheng Zhang, Chao Ma

TL;DR
This paper introduces CInA, a novel self-supervised attention-based method for causal inference that enables zero-shot treatment effect estimation, demonstrating strong generalization across diverse datasets and paving the way for causal foundation models.
Contribution
The paper presents a theoretically justified approach linking attention mechanisms to causal inference, allowing zero-shot causal reasoning with foundation models.
Findings
CInA effectively generalizes to out-of-distribution datasets.
It matches or surpasses traditional methods on real-world data.
It establishes a theoretical primal-dual connection between self-attention and optimal covariate balancing.
Abstract
Foundation models have brought changes to the landscape of machine learning, demonstrating sparks of human-level intelligence across a diverse array of tasks. However, a gap persists in complex tasks such as causal inference, primarily due to challenges associated with intricate reasoning steps and high numerical precision requirements. In this work, we take a first step towards building causally-aware foundation models for treatment effect estimations. We propose a novel, theoretically justified method called Causal Inference with Attention (CInA), which utilizes multiple unlabeled datasets to perform self-supervised causal learning, and subsequently enables zero-shot causal inference on unseen tasks with new data. This is based on our theoretical results that demonstrate the primal-dual connection between optimal covariate balancing and self-attention, facilitating zero-shot causal…
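To make the covariate-balancing idea in the abstract more concrete, here is a minimal sketch of how a set of balancing weights turns into a treatment effect estimate. It is not CInA itself: the weights below are plain inverse-propensity weights computed from known propensities on a hand-made toy dataset, used only as a stand-in for weights that a balancing method would produce. The function names `weighted_ate` and `balance_gap` and all data values are assumptions made for illustration.

```python
import numpy as np


def weighted_ate(y, t, w):
    """Weighted difference in means between treated and control units.

    Weights are normalized within each group, so only their relative
    sizes inside a group matter.
    """
    y, t, w = (np.asarray(a, dtype=float) for a in (y, t, w))
    treated = t == 1
    control = ~treated
    mu1 = np.sum(w[treated] * y[treated]) / np.sum(w[treated])
    mu0 = np.sum(w[control] * y[control]) / np.sum(w[control])
    return mu1 - mu0


def balance_gap(x, t, w):
    """Absolute difference of the weighted covariate means of the two groups."""
    x, t, w = (np.asarray(a, dtype=float) for a in (x, t, w))
    treated = t == 1
    control = ~treated
    m1 = np.sum(w[treated] * x[treated]) / np.sum(w[treated])
    m0 = np.sum(w[control] * x[control]) / np.sum(w[control])
    return abs(m1 - m0)


# Toy data (hypothetical): one binary covariate x that drives both
# treatment assignment and the outcome; the true effect is 2 by construction.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1])
t = np.array([1, 0, 0, 0, 1, 1, 1, 0])
y = 2 * t + 3 * x

# Stand-in weights: inverse-propensity weights from the known assignment
# rates (0.25 when x == 0, 0.75 when x == 1). A balancing method would
# learn its weights instead of being handed the propensities.
propensity = np.where(x == 1, 0.75, 0.25)
w = np.where(t == 1, 1.0 / propensity, 1.0 / (1.0 - propensity))

naive_estimate = weighted_ate(y, t, np.ones(len(y)))
balanced_estimate = weighted_ate(y, t, w)
gap_before = balance_gap(x, t, np.ones(len(y)))
gap_after = balance_gap(x, t, w)

print("naive estimate:", naive_estimate)
print("weighted estimate:", balanced_estimate)
print("covariate gap before / after weighting:", gap_before, gap_after)
```

On this toy data the unweighted difference in means is about 3.5 because treated units have larger x on average, while the weighted estimate is close to the true effect of 2 once the weighted covariate means of the two groups agree.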
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Bayesian Modeling and Causal Inference · Explainable Artificial Intelligence (XAI)
Methods: Causal inference
