Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records
Jingqing Zhang, Xiaoyu Zhang, Kai Sun, Xian Yang, Chengliang Dai, Yike Guo

TL;DR
This paper introduces an unsupervised deep learning method that leverages semantic latent representations and the Human Phenotype Ontology to accurately and efficiently annotate phenotypic abnormalities in large-scale electronic health records.
Contribution
It presents a novel unsupervised framework that improves phenotypic annotation accuracy and efficiency using semantic latent representations and ontology integration.
Findings
Achieves state-of-the-art annotation performance on MIMIC-III data
Demonstrates high computational efficiency
Effectively standardizes phenotypic annotations using HPO
Abstract
Phenotype information is naturally contained in electronic health records (EHRs), and extracting it has proven useful in clinical informatics applications such as disease diagnosis. However, annotating phenotypic abnormalities across millions of EHR narratives remains challenging because of imprecise descriptions, the lack of gold standards, and the demand for efficiency. In this work, we propose a novel unsupervised deep learning framework that annotates phenotypic abnormalities in EHRs via semantic latent representations. The proposed framework takes advantage of the Human Phenotype Ontology (HPO), a knowledge base of phenotypic abnormalities, to standardize the annotation results. Experiments were conducted on 52,722 EHRs from the MIMIC-III dataset. Quantitative and qualitative analyses show that the proposed framework achieves state-of-the-art annotation…
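The abstract does not spell out the model, so the following is only a toy sketch of the general idea of ontology-guided annotation: represent a clinical note and each ontology term as vectors, then assign the terms whose vectors are close to the note's. It is not the authors' method. Bag-of-words counts stand in for learned semantic latent representations, and the term descriptions, the `annotate` function, and the `threshold` value are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Toy stand-in for an ontology: term name -> short description.
# The descriptions are hypothetical text written for this example, not HPO content.
TOY_TERMS = {
    "Fever": "elevated body temperature fever febrile",
    "Dyspnea": "shortness of breath difficulty breathing dyspnea",
    "Tachycardia": "rapid heart rate tachycardia fast heartbeat",
}

def embed(text):
    """Bag-of-words counts; a crude placeholder for a learned latent representation."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def annotate(note, terms=TOY_TERMS, threshold=0.15):
    """Return term names whose similarity to the note meets the (assumed) threshold,
    ordered from most to least similar."""
    note_vec = embed(note)
    scores = {name: cosine(note_vec, embed(desc)) for name, desc in terms.items()}
    matched = [name for name, score in scores.items() if score >= threshold]
    return sorted(matched, key=lambda name: -scores[name])

if __name__ == "__main__":
    print(annotate("Patient is febrile with shortness of breath."))
```

No labeled notes are needed here: the only supervision is the ontology text itself, which is the sense in which such matching is unsupervised. A real system would replace `embed` with a learned encoder, since word overlap misses paraphrases and negation.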
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Machine Learning in Healthcare
