Privacy Preserving Data Imputation via Multi-party Computation for Medical Applications
Julia Jentsch, Ali Burak Ünal, Şeyma Selcan Mağara, Mete Akgün

TL;DR
This paper presents privacy-preserving data imputation techniques using secure multi-party computation tailored for medical datasets, enabling accurate data completion without compromising patient privacy.
Contribution
It introduces secure multi-party computation implementations of mean, median, regression, and kNN imputation methods specifically for sensitive healthcare data.
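As a rough illustration of the simplest of these methods, mean imputation, the sketch below uses additive secret sharing over a prime field. It is not the authors' protocol: the modulus PRIME, the function names (share, reconstruct, secure_mean, impute), the three-hospital setup, and the choice to open the final sum and count are all assumptions made for this example. Values are assumed to be non-negative integers (for instance fixed-point encoded), and at least one value must be observed.

```python
import secrets

# Field modulus for the additive shares (an assumption of this sketch).
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split a non-negative integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine additive shares into the original value."""
    return sum(shares) % PRIME


def secure_mean(party_columns):
    """Mean of all observed values without any party showing its raw data.

    party_columns holds one list per party; None marks a missing entry.
    Each party secret-shares only its local sum and local count.
    """
    n = len(party_columns)
    sum_shares = [0] * n    # sum_shares[i] is the running share held by party i
    count_shares = [0] * n
    for column in party_columns:
        observed = [v for v in column if v is not None]
        for i, s in enumerate(share(sum(observed), n)):
            sum_shares[i] = (sum_shares[i] + s) % PRIME
        for i, s in enumerate(share(len(observed), n)):
            count_shares[i] = (count_shares[i] + s) % PRIME
    # Only the aggregated sum and count are opened, never a local value.
    return reconstruct(sum_shares) / reconstruct(count_shares)


def impute(column, fill):
    """Replace missing entries (None) in a local column with the fill value."""
    return [fill if v is None else v for v in column]


# Hypothetical data: one measurement column split across three hospitals.
hospital_a = [120, None, 135]
hospital_b = [None, 110]
hospital_c = [140, 125, None]

mean = secure_mean([hospital_a, hospital_b, hospital_c])
completed_a = impute(hospital_a, mean)
```

Each individual share is uniformly random on its own, so a party learns nothing about another party's local sum or count from the shares it receives. A full protocol could also keep the mean itself secret-shared and carry out the division inside the computation; this sketch opens the aggregates for brevity.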
Findings
Methods closely match plaintext imputation accuracy
All methods scale linearly with sample size
Except for kNN, methods are suitable for large datasets
Abstract
Handling missing data is crucial in machine learning, but many datasets contain gaps due to errors or non-response. Traditional methods such as listwise deletion are simple but inadequate; the literature offers more sophisticated and effective imputation methods that preserve sample size and improve accuracy. However, these methods require access to the whole dataset, which conflicts with privacy regulations when the data is distributed among multiple sources. Especially in the medical and healthcare domain, such access reveals sensitive information about patients. This study addresses privacy-preserving imputation methods for sensitive data using secure multi-party computation, which enables computation without revealing any party's sensitive information. In this study, we realized the mean, median, regression, and kNN imputation methods in a privacy-preserving way. We specifically…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security
