Secure distributed multiple imputation enables missing data inference for private data proprietors
Haris Smajlović, Yi Lian, Qi Long, Ibrahim Numanagić, Xiaoqian Jiang

TL;DR
This paper introduces a secure method for imputing missing data in private health records across multiple institutions without compromising privacy.
Contribution
The novel contribution is a provably secure distributed imputation framework using secure multiparty computation for collaborative analysis of private EHRs.
Findings
The proposed method runs in practical time and achieves accuracy on par with state-of-the-art, non-secure imputation techniques.
It enables collaborative studies on incomplete, private datasets without centralized data pooling.
The framework improves classification of high-risk ICU patient outcomes using real-world data.
Abstract
Scattered among many healthcare providers across the US, Electronic Health Records (EHRs) are extensively used for research purposes. Collaboration and sharing of EHRs between multiple institutions often provide access to more diverse datasets and a chance to conduct comprehensive studies. However, these collaboration efforts are usually hindered by privacy issues that render the pooling of such data in a centralized database impossible. Furthermore, EHRs are often incomplete and require statistical imputation prior to the study. To enable collaborative studies on top of incomplete, private EHRs, here we provide a provably secure solution built with secure multiparty computation (SMC) that provides practical runtimes and accuracy on par with the state-of-the-art, non-secure equivalents. Our solution enables the utilization of distributed datasets as a whole to impute the missing data…
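To make the idea of imputing from distributed data without pooling it more concrete, the following is a minimal sketch of additive secret sharing, one common SMC building block. It is not the authors' protocol: it swaps in simple mean imputation for the paper's multiple imputation, simulates all parties in a single process with no networking, assumes semi-honest parties and non-negative integer measurements, and reveals the pooled mean to every site. The names `share`, `secure_sum`, and `pooled_mean_impute` are hypothetical and chosen only for illustration.

```python
import secrets

# Field modulus (assumption: all sums stay far below this value).
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split a non-negative integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the value.
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine additive shares into the original value."""
    return sum(shares) % PRIME


def secure_sum(private_values):
    """Sum one private value per party; no party sees another's raw value.

    Simulated in one process: distributed[i][j] is the share that
    party i would send to party j over a private channel.
    """
    n = len(private_values)
    distributed = [share(v, n) for v in private_values]
    # Each party j adds up the shares it received and publishes only that.
    partial = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return reconstruct(partial)


def pooled_mean_impute(site_columns):
    """Fill missing entries (None) with the mean over all sites' observed data.

    Hypothetical helper: mean imputation stands in for the multiple
    imputation described in the abstract.
    """
    local_sums = [sum(x for x in col if x is not None) for col in site_columns]
    local_counts = [sum(1 for x in col if x is not None) for col in site_columns]
    total = secure_sum(local_sums)    # only the global sum is revealed
    count = secure_sum(local_counts)  # only the global count is revealed
    mean = total / count
    return [[mean if x is None else x for x in col] for col in site_columns]


# Two sites holding the same measurement, each with one missing entry.
sites = [[120, None, 130], [110, 140, None]]
imputed = pooled_mean_impute(sites)
```

Each site contributes only random-looking shares of its local sum and count, so the imputed value draws on the combined dataset while the raw records stay where they are. A practical system would also need fixed-point encoding for real-valued data, authenticated channels between sites, and an imputation model richer than a global mean.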
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Electronic Health Records Systems
