Categorical EHR Imputation with Generative Adversarial Nets

Yinchong Yang; Zhiliang Wu; Volker Tresp; Peter A. Fasching

arXiv:2108.01701·cs.LG·August 6, 2021

TL;DR

This paper introduces a new GAN-based method for imputing missing categorical data in electronic health records, improving prediction accuracy over traditional methods.

Contribution

The paper proposes a novel approach to stabilize adversarial training for categorical data imputation using GANs, addressing a key challenge in EHR data processing.

Findings

01. Significant improvement in prediction accuracy with the proposed method

02. Effective stabilization of adversarial training for categorical features

03. Demonstrated success on real-world EHR datasets

Abstract

Electronic Health Records often suffer from missing data, which poses a major problem in clinical practice and clinical studies. Generative Adversarial Nets (GANs), which have generated huge research interest in image generation and transformation, offer a novel approach to dealing with missing data. Recently, researchers have attempted to apply GANs to missing data generation and imputation for EHR data; a major challenge here is the categorical nature of the data. State-of-the-art solutions to the GAN-based generation of categorical data involve either reinforcement learning or learning a bidirectional mapping between the categorical and the real latent feature space, so that the GANs only need to generate real-valued features. However, these methods are designed to generate complete feature vectors instead of imputing only the subsets of missing features. In this paper we…
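
The abstract distinguishes generating a complete feature vector from imputing only the missing subset of features. The following is a minimal sketch of that distinction, not the paper's method: it assumes one-hot encoded categorical features, a binary observation mask, and per-feature category probabilities from some generator. The function name `impute_missing_only` and all array names are hypothetical, chosen for illustration only.

```python
import numpy as np


def impute_missing_only(x_onehot, observed_mask, generated_probs):
    """Keep observed categorical features and fill only the missing ones.

    x_onehot:        (n_features, n_categories) one-hot rows; missing rows are all zero.
    observed_mask:   (n_features,) with 1 where a feature is observed, 0 where missing.
    generated_probs: (n_features, n_categories) category probabilities from a
                     generator (hypothetical stand-in, not the paper's model).
    """
    n_categories = x_onehot.shape[1]
    # Convert generator probabilities to hard one-hot choices via argmax.
    generated_onehot = np.eye(n_categories)[generated_probs.argmax(axis=1)]
    # Broadcast the mask over the category axis.
    m = observed_mask[:, None]
    # Observed rows come from the data; missing rows come from the generator.
    return m * x_onehot + (1 - m) * generated_onehot


# Three categorical features with three categories each; the second is missing.
x = np.array([[1, 0, 0],
              [0, 0, 0],
              [0, 0, 1]])
mask = np.array([1, 0, 1])
probs = np.array([[0.1, 0.8, 0.1],    # ignored: feature 0 is observed
                  [0.2, 0.1, 0.7],    # used: feature 1 is missing
                  [0.9, 0.05, 0.05]])  # ignored: feature 2 is observed
imputed = impute_missing_only(x, mask, probs)
```

In this sketch the generator's output for observed features is discarded, so the observed values are never overwritten. A generator that produces complete feature vectors would, by contrast, replace every row.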
