A Generative Framework for Causal Estimation via Importance-Weighted Diffusion Distillation
Xinran Song, Tianyu Chen, Mingyuan Zhou

TL;DR
This paper introduces IWDD, a new generative framework combining diffusion models and importance weighting to improve causal effect estimation from observational data, reducing variance and enhancing prediction accuracy.
Contribution
It presents an importance-weighted diffusion distillation method that incorporates IPW into the distillation of pretrained diffusion models, and a randomization-based adjustment that removes the need to compute IPW, simplifying computation and reducing the variance of gradient estimates.
Findings
Achieves state-of-the-art out-of-sample prediction performance
Significantly improves causal effect estimation accuracy
Reduces variance of gradient estimates in causal inference
Abstract
Estimating individualized treatment effects from observational data is a central challenge in causal inference, largely due to covariate imbalance and confounding bias from non-randomized treatment assignment. While inverse probability weighting (IPW) is a well-established solution to this problem, its integration into modern deep learning frameworks remains limited. In this work, we propose Importance-Weighted Diffusion Distillation (IWDD), a novel generative framework that combines the pretraining of diffusion models with importance-weighted score distillation to enable accurate and fast causal estimation, including potential outcome prediction and treatment effect estimation. We demonstrate how IPW can be naturally incorporated into the distillation of pretrained diffusion models, and further introduce a randomization-based adjustment that eliminates the need to compute IPW…
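For readers unfamiliar with IPW, the following is a minimal sketch of the classical estimator the abstract refers to, not the IWDD method itself. It assumes synthetic data with a single confounder, a known true propensity score, and a true average treatment effect of 2.0; the function name `ipw_ate` and all variable names are hypothetical.

```python
import numpy as np

def ipw_ate(y, z, e):
    """Self-normalized (Hajek) IPW estimate of the average treatment effect.

    y: outcomes, z: binary treatment indicators, e: propensity scores P(Z=1 | X).
    """
    w1 = z / e                # weights for treated units
    w0 = (1 - z) / (1 - e)    # weights for control units
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

# Synthetic observational data (assumed for illustration only).
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)                     # a single confounder
e = 1.0 / (1.0 + np.exp(-x))               # true propensity, assumed known here
z = rng.binomial(1, e)                     # non-randomized treatment assignment
y = 2.0 * z + x + rng.normal(size=n)       # true treatment effect is 2.0

naive = y[z == 1].mean() - y[z == 0].mean()   # biased by confounding through x
ipw = ipw_ate(y, z, e)                        # reweighting removes that bias

print(f"naive difference in means: {naive:.3f}")
print(f"IPW estimate:              {ipw:.3f}")
```

In practice the propensity score is unknown and must be estimated, and small estimated propensities produce large weights; that is the usual source of the variance problem that motivates avoiding explicit IPW computation.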
Peer Reviews
Decision · Submitted to ICLR 2026
The proposed idea is original. Also, the paper has a clear structure.
The method relies on the core idea that we can substitute the IPWs with the randomization-based adjustment (i.e., shuffling the covariates and treatment assignment). Yet, by doing so, we cannot use the observed outcomes from the original dataset (as they originate from P(X, Z, Y) and not from P(X)P(Z)P(Y|X, Z)). The paper also does not clearly explain what sample is being used for the distillation, so I assume the pre-trained diffusion model was used to sample from both of the potential outcomes
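The reviewer's reading of the randomization-based adjustment can be illustrated with a small sketch. It assumes synthetic data with one confounder and a known propensity, and it shows only the covariate-balance side of the argument: permuting the treatment labels yields pairs drawn from P(X)P(Z), which is the same target distribution that the importance weight P(Z)/P(Z|X) reaches on the observed pairs. It is not the authors' implementation, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)                 # confounder
e = 1.0 / (1.0 + np.exp(-x))           # true propensity P(Z=1 | X), assumed known
z = rng.binomial(1, e)                 # observed pairs (x, z) follow P(X, Z)

def balance_gap(x, z, w=None):
    """Difference in (optionally weighted) covariate means between the two arms."""
    if w is None:
        w = np.ones_like(x)
    treated, control = z == 1, z == 0
    return (np.average(x[treated], weights=w[treated])
            - np.average(x[control], weights=w[control]))

# 1) Observed data: covariates are imbalanced across treatment arms.
gap_observed = balance_gap(x, z)

# 2) Shuffle treatment labels independently of covariates: pairs follow P(X)P(Z).
z_shuffled = rng.permutation(z)
gap_shuffled = balance_gap(x, z_shuffled)

# 3) Importance weights P(Z) / P(Z | X) on the observed pairs reach the same target.
p1 = z.mean()
w = np.where(z == 1, p1 / e, (1 - p1) / (1 - e))
gap_weighted = balance_gap(x, z, w)

print(gap_observed, gap_shuffled, gap_weighted)
```

Note that the shuffled pairs no longer match the observed outcomes, which is exactly the reviewer's concern: outcomes for a shuffled (x, z) pair have to come from somewhere other than the original dataset, for instance from a pretrained conditional model.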
- The idea of applying deep learning techniques such as diffusion and distillation to IPW is innovative and arguably has not been done before.
- The paper details how diffusion and distillation are built into the algorithm and gives the rationale.
- The paper shows good results on the increasingly popular ACIC datasets.
- Diffusion and distillation are popular algorithms, but it is uncertain how much impact the method has, because they are applied only to the handling of confounding factors, i.e., IPW.
- The paper may be missing key ablation studies. For example, IPW itself uses a one-layer regression, which is a shallow ML technique. What if an MLP or another deeper model were used without diffusion or distillation? An ablation isolating the contribution of each technique would strengthen the paper.
- Adapting diffusion models for causal effect estimation is an interesting research idea: it goes beyond pure point estimation of CATEs and allows estimating the conditional potential outcomes distribution, which can then be used to estimate different causal quantities.
- Large parts of the paper are well written, and the flow of the paper is easy to follow.
- The authors provide code for the reproducibility of their results.
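The reviewer's point that a conditional potential outcomes distribution supports more than a point estimate of the CATE can be sketched as follows. The arrays `y0` and `y1` are Gaussian stand-ins for samples that a generative model might produce for Y(0) and Y(1) at one fixed covariate value; the distributions, the threshold, and all names are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for generated samples of Y(0) | x and Y(1) | x at one fixed x.
y0 = rng.normal(loc=1.0, scale=1.0, size=50_000)
y1 = rng.normal(loc=3.0, scale=2.0, size=50_000)

# Point estimate: conditional average treatment effect at this x.
cate = y1.mean() - y0.mean()

# Distributional quantity: difference between the 90th percentiles of the two
# potential-outcome distributions (not the 90th percentile of individual effects).
qte_90 = np.quantile(y1, 0.9) - np.quantile(y0, 0.9)

# Another distributional quantity: change in the probability of exceeding a threshold.
threshold = 4.0
risk_diff = (y1 > threshold).mean() - (y0 > threshold).mean()

print(f"CATE: {cate:.3f}, 90% quantile effect: {qte_90:.3f}, risk difference: {risk_diff:.3f}")
```

With these stand-in distributions the quantile effect is larger than the mean effect because treatment also widens the outcome spread, a difference a pure point estimator of the CATE would not reveal.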
**The contribution/novelty and practicality of the method seem rather limited.**
- The idea of using diffusion models for potential outcomes estimation while tackling distribution shift/treatment selection bias is not new and has already been explored, e.g., by DiffPO. Thus, the major contribution of this paper is avoiding using the IPW term and instead performing importance weighting via sampling. The applied adaptations and performed experiments are also limited and could be improved clearly (se
Taxonomy
Topics: Advanced Causal Inference Techniques · Functional Brain Connectivity Studies · Mental Health Research Topics
Methods: Diffusion
