Controllable Generative Sandbox for Causal Inference
Qi Zhang, Harsh Parikh, Ashley Naimi, Razieh Nabi, Christopher Kim, Timothy Lash

TL;DR
CausalMix is a variational generative framework that produces realistic synthetic causal data with explicit control over overlap, unmeasured confounding, and treatment effect heterogeneity, supporting method validation and study design in causal inference.
Contribution
The paper introduces CausalMix, which combines a mixture of Gaussian latent priors with data-type-specific decoders (continuous, binary, and categorical) and explicit causal controls, balancing realism and controllability in synthetic data generation.
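To make the pairing of a Gaussian mixture prior with data-type-specific decoders concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names (`sample_mixture_prior`, `decode_mixed`), the linear decoder weights, and all numeric values are assumptions chosen for illustration, and a real model would learn these components with neural networks under a variational objective.

```python
import numpy as np

def sample_mixture_prior(rng, n, weights, means, scales):
    """Draw n latent vectors from a diagonal-covariance Gaussian mixture."""
    comp = rng.choice(len(weights), size=n, p=weights)  # component per sample
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comp] + scales[comp] * noise, comp

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def decode_mixed(rng, z, w_cont, w_bin, w_cat, noise_sd=0.1):
    """Map latents to one continuous, one binary, and one categorical column.

    Hypothetical linear heads stand in for learned decoders.
    """
    n = z.shape[0]
    # Continuous head: Gaussian likelihood around a linear mean.
    x_cont = z @ w_cont + noise_sd * rng.standard_normal(n)
    # Binary head: Bernoulli likelihood with a logistic link.
    p_bin = 1.0 / (1.0 + np.exp(-(z @ w_bin)))
    x_bin = rng.binomial(1, p_bin)
    # Categorical head: softmax probabilities, sampled by inverse CDF.
    p_cat = softmax(z @ w_cat)
    u = rng.random(n)
    x_cat = (u[:, None] > np.cumsum(p_cat, axis=1)).sum(axis=1)
    x_cat = np.minimum(x_cat, p_cat.shape[1] - 1)  # guard against rounding
    return x_cont, x_bin, x_cat

# Illustrative values only (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
weights = np.array([0.6, 0.4])
means = np.array([[-2.0, 0.0], [2.0, 1.0]])
scales = np.array([[0.5, 0.5], [0.3, 0.8]])
z, comp = sample_mixture_prior(rng, 1000, weights, means, scales)

w_cont = np.array([1.0, -0.5])
w_bin = np.array([0.8, 0.2])
w_cat = rng.standard_normal((2, 3))  # 2 latent dims, 3 categories
x_cont, x_bin, x_cat = decode_mixed(rng, z, w_cont, w_bin, w_cat)
```

The mixture prior lets the latent space carry several modes, while each head uses a likelihood suited to its column type, which is the general idea behind handling multimodal, mixed-type tables.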
Findings
Achieves state-of-the-art metrics on mixed-type tabular data
Provides stable and fine-grained causal control
Demonstrates utility in a safety study and in power analysis
Abstract
Method validation and study design in causal inference rely on synthetic data with known counterfactuals. Existing simulators trade off distributional realism, the ability to capture mixed-type and multimodal tabular data, against causal controllability, including explicit control over overlap, unmeasured confounding, and treatment effect heterogeneity. We introduce CausalMix, a variational generative framework that closes this gap by coupling a mixture of Gaussian latent priors with data-type-specific decoders for continuous, binary, and categorical variables. The model incorporates explicit causal controls: an overlap regularizer shaping propensity-score distributions, alongside direct parameterizations of confounding strength and effect heterogeneity. This unified objective preserves fidelity to the observed data while enabling factorial manipulation of causal mechanisms, allowing…
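The three causal controls named in the abstract (overlap, unmeasured confounding, and effect heterogeneity) can be illustrated with a toy simulator that exposes one parameter per control and returns both potential outcomes. This is a hedged sketch, not CausalMix itself: the paper describes a learned model with an overlap regularizer, whereas the code below swaps in a simple hand-written data-generating process, and the function `simulate` and its parameters are hypothetical.

```python
import numpy as np

def simulate(n, overlap=1.0, confounding=1.0, heterogeneity=0.5, seed=0):
    """Toy data-generating process with known counterfactuals.

    overlap:       larger values pull propensity scores toward 0.5
    confounding:   strength of the unmeasured confounder u
    heterogeneity: how strongly the treatment effect varies with x
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)  # observed covariate
    u = rng.standard_normal(n)  # unmeasured confounder
    # Dividing the logits by `overlap` controls how extreme propensities get.
    logits = (x + confounding * u) / overlap
    propensity = 1.0 / (1.0 + np.exp(-logits))
    t = rng.binomial(1, propensity)
    tau = 1.0 + heterogeneity * x  # unit-level treatment effect
    y0 = x + confounding * u + rng.standard_normal(n)
    y1 = y0 + tau
    y = np.where(t == 1, y1, y0)  # only one outcome is observed per unit
    return {"x": x, "t": t, "y": y, "y0": y0, "y1": y1,
            "tau": tau, "propensity": propensity}

# Factorial manipulation: vary one control while holding the others fixed.
poor_overlap = simulate(5000, overlap=0.25)
good_overlap = simulate(5000, overlap=2.0)
homogeneous = simulate(5000, heterogeneity=0.0)

def extremeness(d):
    """Mean distance of propensity scores from 0.5."""
    return np.abs(d["propensity"] - 0.5).mean()
```

Because `y0` and `y1` are both returned, any estimator run on `(x, t, y)` can be scored against the true effects in `tau`, which is what makes this kind of sandbox usable for method validation.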
Taxonomy
Topics: Advanced Causal Inference Techniques · Generative Adversarial Networks and Image Synthesis · Bayesian Modeling and Causal Inference
