Differentially Private Normalizing Flows for Density Estimation, Data Synthesis, and Variational Inference with Application to Electronic Health Records
Bingyue Su, Yu Wang, Daniele E. Schiavazzi, Fang Liu

TL;DR
This paper introduces a differentially private normalizing flow approach for density estimation and data synthesis on sensitive datasets such as electronic health records. The resulting synthetic data support privacy-preserving analysis while retaining utility for predictive tasks.
Contribution
The paper combines normalizing flows with differential privacy guarantees for density estimation and synthetic data generation, and applies the method to EHR data.
Findings
Synthetic data maintains utility for predictive tasks.
Differentially private VI can alter correlation structures.
Alternative loss functions may improve privacy-utility trade-offs.
Abstract
Electronic health records (EHR) often contain sensitive medical information about individual patients, posing significant limitations to sharing or releasing EHR data for downstream learning and inferential tasks. We use normalizing flows (NF), a family of deep generative models, to estimate the probability density of a dataset with differential privacy (DP) guarantees, from which privacy-preserving synthetic data are generated. We apply the technique to an EHR dataset containing patients with pulmonary hypertension. We assess the learning and inferential utility of the synthetic data by comparing the accuracy in the prediction of the hypertension status and variational posterior distribution of the parameters of a physics-based model. In addition, we use a simulated dataset from a nonlinear model to compare the results from variational inference (VI) based on privacy-preserving…
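To make the idea in the abstract concrete, here is a minimal sketch of fitting a flow under differential privacy and then sampling synthetic data from it. Several things are assumptions rather than details taken from the paper: the training mechanism is DP-SGD-style (per-example gradient clipping plus Gaussian noise), the flow is a toy one-dimensional affine map with a standard normal base, and all names (`per_example_grads`, `clip_rows`, `dp_sgd_fit`, `sample`) and hyperparameters are hypothetical. Privacy accounting, meaning the conversion of the noise multiplier, sampling rate, and step count into an (ε, δ) guarantee, is deliberately left out.

```python
import numpy as np

def per_example_grads(x, mu, s):
    """Per-example gradients of the negative log-likelihood of a 1-D affine flow.

    Flow: z = (x - mu) * exp(-s) with base density N(0, 1), so
    -log p(x) = 0.5 * z**2 + 0.5 * log(2 * pi) + s.
    """
    z = (x - mu) * np.exp(-s)
    g_mu = -z * np.exp(-s)   # d(-log p)/d mu
    g_s = 1.0 - z**2         # d(-log p)/d s
    return np.stack([g_mu, g_s], axis=1)

def clip_rows(g, clip_norm):
    """Scale each per-example gradient so its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    factor = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return g * factor

def dp_sgd_fit(x, steps=2000, batch_size=256, lr=0.05,
               clip_norm=1.0, noise_multiplier=1.0, seed=0):
    """Assumed DP-SGD-style loop: clip, sum, add Gaussian noise, average, step."""
    rng = np.random.default_rng(seed)
    mu, s = 0.0, 0.0
    for _ in range(steps):
        batch = rng.choice(x, size=batch_size, replace=False)
        g = clip_rows(per_example_grads(batch, mu, s), clip_norm)
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=2)
        step = lr * (g.sum(axis=0) + noise) / batch_size
        mu -= step[0]
        s -= step[1]
    return mu, s

def sample(mu, s, n, seed=1):
    """Draw synthetic points by pushing base samples through the inverse flow."""
    rng = np.random.default_rng(seed)
    return mu + np.exp(s) * rng.standard_normal(n)

# Simulated stand-in for a sensitive column (not real EHR data).
rng = np.random.default_rng(42)
private_x = rng.normal(3.0, 2.0, size=5000)

mu_hat, s_hat = dp_sgd_fit(private_x)
synthetic_x = sample(mu_hat, s_hat, n=1000)
```

Because sampling uses only the privately trained parameters, the synthetic points inherit whatever guarantee the training run satisfies, by the post-processing property of differential privacy. Note that clipping biases the gradients, so the fitted scale need not match the maximum-likelihood estimate.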
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare · Insurance, Mortality, Demography, Risk Management
Methods: Normalizing Flows · Variational Inference
