Evaluation of the Synthetic Electronic Health Records

Emily Muller; Xu Zheng; Jer Hayes

arXiv:2210.08655·cs.LG·October 18, 2022

TL;DR

This paper introduces new metrics for evaluating synthetic electronic health records, addressing privacy and utility concerns, and demonstrates their effectiveness in comparing generative models for medical data synthesis.

Contribution

It proposes two novel metrics, Similarity and Uniqueness, for assessing synthetic EHR data quality and privacy, enhancing model comparison beyond traditional utility measures.

Findings

01. Metrics effectively evaluate synthetic data quality.
02. Proposed metrics distinguish between different generative models.
03. Synthetic EHRs maintain data utility while considering privacy.

Abstract

Generative models have been found effective for data synthesis due to their ability to capture complex underlying data distributions. The quality of data generated by these models is commonly evaluated by visual inspection for image datasets or by downstream analytical tasks for tabular datasets. These evaluation methods neither measure the implicit data distribution nor consider data privacy issues, and how to compare and rank different generative models remains an open question. Medical data can be sensitive, so it is of great importance to address patients' privacy concerns while maintaining the utility of the synthetic dataset. Beyond utility evaluation, this work outlines two metrics, Similarity and Uniqueness, for sample-wise assessment of synthetic datasets. We demonstrate the proposed notions with several state-of-the-art generative models to synthesise…


Taxonomy

Topics: Epigenetics and DNA Methylation