Evaluation of Embeddings of Laboratory Test Codes for Patients at a Cancer Center
Lorenzo A. Rossi, Chad Shawber, Janet Munu, Finly Zachariah

TL;DR
This study trains and evaluates embeddings of laboratory test codes from EHR data at a cancer center, assessing their clinical relevance and utility in mortality prediction.
Contribution
It introduces embedding representations for LOINC codes, including variants that encode whether a test result was normal or abnormal, and evaluates their clinical similarity and predictive performance.
Findings
Embeddings show meaningful clinical similarities.
Including outcome information improves mortality prediction.
Embeddings preserve ordinality of test results.
Abstract
Laboratory test results are an important and generally high-dimensional component of a patient's Electronic Health Record (EHR). We train embedding representations (via Word2Vec and GloVe) for LOINC codes of laboratory tests from the EHRs of about 80,000 patients at a cancer center. To include information about lab test outcomes, we also train embeddings on the concatenation of a LOINC code with a symbol indicating normality or abnormality of the result. We observe several clinically meaningful similarities among LOINC embeddings trained over our data. For the embeddings of the concatenation of LOINCs with abnormality codes, we evaluate the performance for mortality prediction tasks and the ability to preserve ordinality properties: that is, a lab test with a normal outcome should be more similar to an abnormal one than to a very abnormal one.
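The token construction and the ordinality property described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption for illustration: the underscore separator, the flag symbols "N", "A" and "VA", the helper names, and the hand-picked two-dimensional toy vectors. None of these come from the paper, and the vectors are not results.

```python
import math

def make_token(loinc_code, flag):
    """Concatenate a LOINC code with an abnormality flag (hypothetical format)."""
    return f"{loinc_code}_{flag}"

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def preserves_ordinality(embeddings, loinc_code, flags=("N", "A", "VA")):
    """Check that normal is closer to abnormal than to very abnormal."""
    normal, abnormal, very_abnormal = (
        embeddings[make_token(loinc_code, f)] for f in flags
    )
    return cosine(normal, abnormal) > cosine(normal, very_abnormal)

# Toy vectors chosen by hand for illustration only; a trained model
# (e.g. Word2Vec or GloVe) would supply the real embeddings.
toy_embeddings = {
    make_token("2345-7", "N"): [1.0, 0.0],
    make_token("2345-7", "A"): [0.8, 0.6],
    make_token("2345-7", "VA"): [0.0, 1.0],
}

if __name__ == "__main__":
    print(preserves_ordinality(toy_embeddings, "2345-7"))
```

In this sketch, each patient's lab history would be turned into a sequence of such tokens before training, so that a normal and an abnormal result of the same test receive separate vectors.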
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare
