Combining observational and experimental data for causal inference considering data privacy
Charlotte Z. Mann, Adam C. Sales, Johann A. Gagnon-Bartsch

TL;DR
This paper investigates how to combine privacy-preserving observational data with experimental data to improve causal inference, balancing data utility and privacy risks through various data transformation techniques.
Contribution
It introduces methods that apply privacy-preserving transformations to observational data so that the transformed data can be combined with experimental data to improve treatment effect estimation.
Findings
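Disclosure-limiting transformations of observational data can still improve treatment effect estimates when combined with experimental data.
The trade-off between privacy and utility depends on which transformation is applied.
Organizations that can tolerate a small disclosure risk can release transformed data that improves the accuracy of causal estimates.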
Abstract
Combining observational and experimental data for causal inference can improve treatment effect estimation. However, many observational data sets cannot be released due to data privacy considerations, so one researcher may not have access to both experimental and observational data. Nonetheless, a small amount of risk of disclosing sensitive information might be tolerable to organizations that house confidential data. In these cases, organizations can employ data privacy techniques, which decrease disclosure risk, potentially at the expense of data utility. In this paper, we explore disclosure limiting transformations of observational data, which can be combined with experimental data to estimate the sample and population average treatment effects. We consider leveraging observational data to improve generalizability of treatment effect estimates when a randomized experiment (RCT) is…
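The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method. It assumes one hypothetical transformation: the data-holding organization fits a linear outcome model on its observational data and releases only the coefficients, perturbed with Laplace noise. The experimenter then subtracts the model's predictions from the RCT outcomes before taking a difference in means. All function names and the `noise_scale` parameter are illustrative assumptions, and the noise here carries no formal privacy guarantee.

```python
import numpy as np

def fit_noisy_linear_model(x_obs, y_obs, noise_scale, rng):
    """Fit least squares on observational data, then perturb the
    coefficients with Laplace noise (a hypothetical release mechanism)."""
    design = np.column_stack([np.ones(len(x_obs)), x_obs])
    coef, *_ = np.linalg.lstsq(design, y_obs, rcond=None)
    return coef + rng.laplace(0.0, noise_scale, size=coef.shape)

def predict(coef, x):
    """Predict outcomes from released coefficients (intercept first)."""
    return np.column_stack([np.ones(len(x)), x]) @ coef

def diff_in_means(values, treated):
    """Treated mean minus control mean; `treated` is a boolean array."""
    return values[treated].mean() - values[~treated].mean()

def adjusted_effect(y_rct, x_rct, treated, coef):
    """Difference in means of RCT outcomes after subtracting the
    predictions of the externally fitted model."""
    return diff_in_means(y_rct - predict(coef, x_rct), treated)

# Simulated illustration (all numbers below are made up for the sketch).
rng = np.random.default_rng(0)
x_obs = rng.normal(size=5000)
y_obs = 1.0 + 2.0 * x_obs + rng.normal(size=5000)
released_coef = fit_noisy_linear_model(x_obs, y_obs, noise_scale=0.1, rng=rng)

x_rct = rng.normal(size=100)
treated = rng.permutation(100) < 50          # 50 treated, 50 control
y_rct = 1.0 + 2.0 * x_rct + 0.5 * treated + rng.normal(size=100)

unadjusted = diff_in_means(y_rct, treated)
adjusted = adjusted_effect(y_rct, x_rct, treated, released_coef)
```

Because the released coefficients are fixed before treatment is assigned, subtracting the predictions does not depend on who was treated. Raising `noise_scale` makes the predictions less useful, which is one simple way to see the privacy-utility trade-off the paper discusses.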
Taxonomy
Topics: Advanced Causal Inference Techniques · Privacy-Preserving Technologies in Data · Statistical Methods and Bayesian Inference
