Federated Causal Discovery Across Heterogeneous Datasets under Latent Confounding
Maximilian Hahn, Alina Zajak, Dominik Heider, Adèle Helena Ribeiro

TL;DR
This paper introduces fedCI and fedCI-IOD, federated methods for causal discovery across heterogeneous datasets that preserve privacy, handle diverse variable types, and outperform traditional approaches in statistical power.
Contribution
The paper presents the first federated causal discovery method capable of handling heterogeneous datasets with latent confounding and various variable types, advancing privacy-preserving causal analysis.
Findings
fedCI effectively handles diverse variable types and heterogeneity.
fedCI-IOD enables federated causal discovery under latent confounding.
The methods achieve performance comparable to pooled analyses.
Abstract
Causal discovery across multiple datasets is often constrained by data privacy regulations and cross-site heterogeneity, limiting the use of conventional methods that require a single, centralized dataset. To address these challenges, we introduce fedCI, a federated conditional independence test that rigorously handles heterogeneous datasets with non-identical sets of variables, site-specific effects, and mixed variable types, including continuous, ordinal, binary, and categorical variables. At its core, fedCI uses a federated Iteratively Reweighted Least Squares (IRLS) procedure to estimate the parameters of generalized linear models underlying likelihood-ratio tests for conditional independence. Building on this, we develop fedCI-IOD, a federated extension of the Integration of Overlapping Datasets (IOD) algorithm that replaces its meta-analysis strategy and enables, for the first…
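To make the core idea in the abstract concrete, the following is a minimal sketch of a federated IRLS fit followed by a likelihood-ratio test. It is an illustration under simplifying assumptions, not the authors' implementation: the response is binary (logistic GLM), every site observes the same variables, there are no site-specific effects, and the tested variable is a single column (1 degree of freedom). The function names (local_stats, federated_irls, federated_lrt) and the simulated data are hypothetical. Each site shares only the aggregates X'WX, X'Wz, and its log-likelihood, never individual rows.

```python
import math
import numpy as np

def local_stats(X, y, beta):
    """Run at one site: return aggregate IRLS statistics only (no raw rows)."""
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))
    w = np.clip(mu * (1.0 - mu), 1e-10, None)   # IRLS weights for the logit link
    z = eta + (y - mu) / w                      # working response
    mu_c = np.clip(mu, 1e-12, 1.0 - 1e-12)
    loglik = np.sum(y * np.log(mu_c) + (1.0 - y) * np.log(1.0 - mu_c))
    return X.T @ (w[:, None] * X), X.T @ (w * z), loglik

def federated_irls(sites, n_iter=25, tol=1e-8):
    """Server loop: sum the site aggregates and solve the weighted normal equations."""
    beta = np.zeros(sites[0][0].shape[1])
    for _ in range(n_iter):
        stats = [local_stats(X, y, beta) for X, y in sites]
        A = sum(s[0] for s in stats)
        b = sum(s[1] for s in stats)
        new_beta = np.linalg.solve(A, b)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    loglik = sum(local_stats(X, y, beta)[2] for X, y in sites)
    return beta, loglik

def federated_lrt(sites, drop_col):
    """Likelihood-ratio test of whether column drop_col matters given the others."""
    null_sites = [(np.delete(X, drop_col, axis=1), y) for X, y in sites]
    _, ll_full = federated_irls(sites)
    _, ll_null = federated_irls(null_sites)
    stat = max(0.0, 2.0 * (ll_full - ll_null))
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square survival function, 1 df
    return stat, p_value

# Simulated data for three hypothetical sites (illustration only).
rng = np.random.default_rng(0)
sites = []
for n in (150, 200, 250):
    x, z = rng.normal(size=n), rng.normal(size=n)
    X = np.column_stack([np.ones(n), x, z])
    prob = 1.0 / (1.0 + np.exp(-(0.3 + 1.5 * x - 0.8 * z)))
    sites.append((X, (rng.random(n) < prob).astype(float)))

beta_fed, _ = federated_irls(sites)
pooled = [(np.vstack([X for X, _ in sites]), np.concatenate([y for _, y in sites]))]
beta_pooled, _ = federated_irls(pooled)
stat, p_value = federated_lrt(sites, drop_col=1)  # is y independent of x given z?
```

Because the summed aggregates equal those of the stacked data, the federated estimate in this simplified setting matches the pooled fit up to floating-point error. The heterogeneity the abstract describes (site-specific effects, non-identical variable sets, ordinal and categorical responses) is not covered by this sketch.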
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Advanced Causal Inference Techniques · Privacy-Preserving Technologies in Data
