RIDDLE: Race and ethnicity Imputation from Disease history with Deep LEarning

Ji-Sung Kim; Xin Gao; Andrey Rzhetsky

arXiv:1707.01623 · q-bio.QM · May 1, 2018

TL;DR

This paper introduces RIDDLE, a deep learning method that accurately imputes race and ethnicity from electronic medical records, outperforming traditional models and revealing insights into disease patterns across different groups.

Contribution

The study presents a novel deep neural network approach for race and ethnicity imputation from medical histories, with improved accuracy and interpretability over existing methods.

Findings

1. RIDDLE significantly outperforms logistic regression and random forest on accuracy, cross-entropy loss, and AUC (all $p < 10^{-6}$).

2. Interpreting the trained network identifies medical features that are predictive of race and ethnicity.

3. Imputed race and ethnicity help uncover disease patterns that differ across groups.

Abstract

Anonymized electronic medical records are an increasingly popular source of research data. However, these datasets often lack race and ethnicity information. This creates problems for researchers modeling human disease, as race and ethnicity are powerful confounders for many health exposures and treatment outcomes; race and ethnicity are closely linked to population-specific genetic variation. We showed that deep neural networks generate more accurate estimates for missing racial and ethnic information than competing methods (e.g., logistic regression, random forest). RIDDLE yielded significantly better classification performance across all metrics that were considered: accuracy, cross-entropy loss (error), and area under the curve for receiver operating characteristic plots (all $p < 10^{-6}$). We made specific efforts to interpret the trained neural network models to identify,…
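The three metrics named in the abstract (accuracy, cross-entropy loss, and area under the ROC curve) can be illustrated with a minimal sketch. This is not the RIDDLE code: the function names, the toy probabilities, and the choice of macro-averaged one-vs-rest AUC for the multiclass case are all assumptions made for illustration.

```python
import math


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class equals the label."""
    hits = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        hits += predicted == y
    return hits / len(labels)


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)


def binary_auc(scores, is_positive):
    """ROC AUC as the chance a positive outranks a negative (ties count half)."""
    pos = [s for s, t in zip(scores, is_positive) if t]
    neg = [s for s, t in zip(scores, is_positive) if not t]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(probs, labels):
    """Assumed averaging scheme: unweighted mean of one-vs-rest AUCs."""
    n_classes = len(probs[0])
    aucs = [
        binary_auc([p[k] for p in probs], [y == k for y in labels])
        for k in range(n_classes)
    ]
    return sum(aucs) / n_classes


# Hypothetical predicted class probabilities for four records, three classes.
toy_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.4, 0.5, 0.1],
]
toy_labels = [0, 1, 2, 0]

print("accuracy:", accuracy(toy_probs, toy_labels))
print("cross-entropy:", cross_entropy(toy_probs, toy_labels))
print("macro OvR AUC:", macro_ovr_auc(toy_probs, toy_labels))
```

Higher accuracy and AUC are better, while lower cross-entropy is better, so a model that "outperforms across all metrics" improves the first and third and reduces the second.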

Code & Models

Repositories

jisungk/riddle (official, TensorFlow)
