Linear mixed modelling of federated data when only the mean, covariance, and sample size are available
Marie Analiz April Limpoco, Christel Faes, Niel Hens

TL;DR
This paper introduces a novel method for fitting linear mixed models using only summary statistics from multiple data sources, preserving privacy and reducing communication compared to federated learning.
Contribution
The proposed framework yields estimates equivalent to those from an individual-level data analysis while using only the mean, covariance, and sample size from each data provider, which simplifies federated data analysis.
Findings
Yields estimates identical to those from individual-level data analysis.
Requires only one round of summary statistic sharing.
Demonstrated on real patient data from 70 clinics.
Abstract
In medical research, individual-level patient data provide invaluable information, but the patients' right to confidentiality remains of utmost priority. This poses a huge challenge when estimating statistical models such as linear mixed models, which are an extension of linear regression models that can account for potential heterogeneity whenever data come from different data providers. Federated learning algorithms tackle this hurdle by estimating parameters without retrieving individual-level data. Instead, iterative communication of parameter estimate updates between the data providers and analyst is required. In this paper, we propose an alternative framework to federated learning algorithms for fitting linear mixed models. Specifically, our approach only requires the mean, covariance, and sample size of multiple covariates from different data providers once. Using the principle of…
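The abstract says the approach needs only the mean, covariance, and sample size from each data provider, shared once. The paper's full procedure is not reproduced on this page, so what follows is only a minimal Python sketch of one ingredient that makes such an approach plausible: the cross-product matrix of a data block can be recovered exactly from those three summaries, since Z'Z = (n - 1) S + n m m'. The sketch covers ordinary least squares with a common intercept, not the linear mixed model itself. The function names, the three simulated "providers", and the coefficient values are all hypothetical.

```python
import numpy as np


def crossproducts_from_summary(mean, cov, n):
    """Recover Z'Z from the sample mean, sample covariance (ddof=1), and n."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    return (n - 1) * cov + n * np.outer(mean, mean)


def pooled_ols_from_summaries(summaries):
    """Pooled OLS fit (intercept + slopes) from per-provider summaries.

    Each summary is (mean, cov, n) for the columns [x1, ..., xp, y].
    """
    p = len(summaries[0][0]) - 1
    xtx = np.zeros((p + 1, p + 1))  # X'X for the design [1, x1, ..., xp]
    xty = np.zeros(p + 1)           # X'y
    for mean, cov, n in summaries:
        mean = np.asarray(mean, dtype=float)
        zz = crossproducts_from_summary(mean, cov, n)
        xtx[0, 0] += n
        xtx[0, 1:] += n * mean[:p]
        xtx[1:, 0] += n * mean[:p]
        xtx[1:, 1:] += zz[:p, :p]
        xty[0] += n * mean[p]
        xty[1:] += zz[:p, p]
    return np.linalg.solve(xtx, xty)


# Simulated data standing in for three providers (assumption, not real data).
rng = np.random.default_rng(0)
blocks = []
for n in (40, 55, 70):
    x = rng.normal(size=(n, 2))
    y = 1.0 + x @ np.array([2.0, -0.5]) + rng.normal(scale=0.3, size=n)
    blocks.append(np.column_stack([x, y]))

# Each provider shares only its mean vector, covariance matrix, and sample size.
summaries = [(b.mean(axis=0), np.cov(b, rowvar=False), b.shape[0]) for b in blocks]
beta_summary = pooled_ols_from_summaries(summaries)

# Reference fit on the stacked individual-level data.
full = np.vstack(blocks)
design = np.column_stack([np.ones(full.shape[0]), full[:, :2]])
beta_full = np.linalg.lstsq(design, full[:, 2], rcond=None)[0]

print(np.allclose(beta_summary, beta_full))
```

The two coefficient vectors agree up to floating-point error because the least-squares solution depends on the data only through X'X and X'y. Extending this idea to random effects requires the provider-level structure that the paper addresses and is not attempted here.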
Taxonomy
Topics: Statistical Methods and Bayesian Inference
