Testing Generalizability in Causal Inference
Daniel de Vassimon Manela, Linying Yang, Robin J. Evans

TL;DR
This paper introduces a formal framework for statistically evaluating the generalizability of high-dimensional causal inference models across diverse domains, addressing a key gap in causal machine learning evaluation.
Contribution
It proposes a systematic, simulation-based approach using frugal parameterization for realistic and comprehensive evaluation of causal inference models' generalizability.
Findings
Framework enables realistic evaluation with real-world data
Statistical testing provides safeguards against over-reliance on traditional predictive metrics such as MSE
Method supports assessment of both mean and distributional regression models
Abstract
Ensuring robust model performance in diverse real-world scenarios requires addressing generalizability across domains with covariate shifts. However, no formal procedure exists for statistically evaluating generalizability in machine learning algorithms. Existing predictive metrics like mean squared error (MSE) help to quantify the relative performance between models, but do not directly answer whether a model can or cannot generalize. To address this gap in the domain of causal inference, we propose a systematic framework for statistically evaluating the generalizability of high-dimensional causal inference models. Our approach uses the frugal parameterization to flexibly simulate from fully and semi-synthetic causal benchmarks, offering a comprehensive evaluation for both mean and distributional regression methods. Grounded in real-world data, our method ensures more realistic…
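To make the abstract's distinction between a relative metric and a statistical test concrete, here is a minimal sketch. It is not the authors' procedure and does not use the frugal parameterization: it assumes a toy linear simulator with a known heterogeneous treatment effect, a shifted target covariate distribution, and a simple z-test of whether the estimation error in the target domain has mean zero. All function names, constants, and the choice of test are hypothetical and chosen only for illustration.

```python
import math
import numpy as np

# Assumption: source covariates are centred at 0, target covariates at 1.5.
TARGET_X_MEAN = 1.5


def simulate(n, x_mean, rng):
    """Toy simulator (hypothetical): the treatment effect is 1 + x."""
    x = rng.normal(x_mean, 1.0, size=n)
    t = rng.binomial(1, 0.5, size=n).astype(float)
    y = 1.0 + 0.5 * x + (1.0 + x) * t + rng.normal(0.0, 1.0, size=n)
    return x, t, y


def estimate_target_ate(x_src, t_src, y_src, x_tgt, interaction):
    """Fit OLS on source data, then average predicted effects over target covariates."""
    cols = [np.ones_like(x_src), x_src, t_src]
    if interaction:
        cols.append(t_src * x_src)
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), y_src, rcond=None)
    effect = coef[2] + (coef[3] * x_tgt if interaction else 0.0)
    return float(np.mean(effect))


def z_test_zero_mean(errors):
    """Two-sided z-test of H0: the mean estimation error is zero."""
    errors = np.asarray(errors, dtype=float)
    se = errors.std(ddof=1) / math.sqrt(len(errors))
    z = float(errors.mean() / se)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def run_experiment(n_reps=200, n=500, seed=0):
    rng = np.random.default_rng(seed)
    # The true target-domain effect is known only because the data are simulated.
    true_ate = 1.0 + TARGET_X_MEAN
    errors = {"interaction": [], "no_interaction": []}
    for rep in range(n_reps):
        x_src, t_src, y_src = simulate(n, 0.0, rng)
        x_tgt, _, _ = simulate(n, TARGET_X_MEAN, rng)
        for name, flag in (("interaction", True), ("no_interaction", False)):
            est = estimate_target_ate(x_src, t_src, y_src, x_tgt, flag)
            errors[name].append(est - true_ate)
    results = {}
    for name, errs in errors.items():
        z, p_value = z_test_zero_mean(errs)
        results[name] = {"mean_error": float(np.mean(errs)), "z": z, "p_value": p_value}
    return results


if __name__ == "__main__":
    for name, summary in run_experiment().items():
        print(name, summary)
```

In this sketch, the model without the interaction term cannot track how the effect changes under covariate shift, so the test should reject it decisively, while the model with the interaction term is correctly specified for the toy simulator. An MSE comparison would only rank the two models; the test gives each one an explicit pass or fail against the known simulated truth.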
Taxonomy
Topics: Bayesian Modeling and Causal Inference
Methods: Focus · Causal inference
