Machine Learning for Structured Clinical Data
Brett K. Beaulieu-Jones

TL;DR
This paper discusses the challenges of applying machine learning to structured electronic health record data, emphasizing issues like data inconsistency, missing information, and interpretability for clinical use.
Contribution
It highlights key obstacles in using EHR data for machine learning and reviews approaches to address data standardization, missingness, and interpretability.
Findings
Identifies major challenges in applying ML to EHR data, including non-standardized formats, missing values, and scarce research-quality annotations.
Reviews methods for handling missing and inconsistent data.
Discusses the importance of interpretability for clinical adoption of ML.
Abstract
Research is a tertiary priority in the EHR, where the priorities are patient care and billing. Because of this, the data is not standardized or formatted in a manner easily adapted to machine learning approaches. Data may be missing for a large variety of reasons ranging from individual input styles to differences in clinical decision making, for example, which lab tests to issue. Few patients are annotated at a research quality, limiting sample size and presenting a moving gold standard. Patient progression over time is key to understanding many diseases but many machine learning algorithms require a snapshot, at a single time point, to create a usable vector form. Furthermore, algorithms that produce black box results do not provide the interpretability required for clinical adoption. This chapter discusses these challenges and others in applying machine learning techniques to the…
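The snapshot and missingness issues described in the abstract can be illustrated with a minimal sketch. It is not the chapter's method. It assumes a hypothetical record format of (feature, time, value) tuples, made-up feature names and values, and a simple mean-plus-indicator encoding chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def snapshot_vector(records, features):
    """Collapse (feature, time, value) observations into one fixed-length vector.

    For each feature this emits two numbers: the mean of its observed values
    (0.0 if none were observed) and a 0/1 flag that is 1.0 when the feature
    was never observed. The encoding is an illustrative assumption.
    """
    observed = defaultdict(list)
    for name, _time, value in records:
        if value is not None:  # None stands for a test that was never resulted
            observed[name].append(value)

    vector = []
    for name in features:
        values = observed.get(name, [])
        vector.append(mean(values) if values else 0.0)
        vector.append(0.0 if values else 1.0)
    return vector


# Hypothetical patient: two glucose readings, one creatinine order with no
# recorded value, and no HbA1c test at all.
patient = [
    ("glucose", 0, 5.0),
    ("glucose", 30, 7.0),
    ("creatinine", 0, None),
]
features = ["glucose", "creatinine", "hba1c"]
vec = snapshot_vector(patient, features)
```

Averaging discards the ordering of the two glucose readings, which is the loss of patient progression the abstract points to. The indicator columns keep "never measured" distinct from a measured value of zero, which matters when missingness reflects a clinical decision such as which lab tests to order.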
Taxonomy
Topics: Machine Learning in Healthcare · Biomedical Text Mining and Ontologies
