Differential Distributions: A refined methodology to indirect reference interval estimation by including Patient's health status according to associated ICD-10 codes
David Schär, Tobias U. Blatter, Harald Witte, Jivko Stoyanov, Martin Hersberger, Christos T. Nakas, Alexander B. Leichtle

TL;DR
A new method uses patient health data from ICD-10 codes to create more accurate blood test reference intervals that consider age, sex, and health status.
Contribution
A novel reference interval inference approach that incorporates ICD-10 coding using natural language processing.
Findings
The DDM adjusts reference intervals dynamically across patient groups based on age and health status.
Reference intervals for potassium levels showed tighter confidence intervals in older adults after excluding results from significantly different subpopulations.
The method reduces standard deviation by filtering out test results from patients with ICD-10 codes indicating significant deviations from the general population.
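The filtering idea in the findings above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the summary does not say which statistical test or cutoff the DDM uses, so the Welch t statistic, the `threshold` of 4.0, the `min_n` guard, the function names, and the toy potassium values are all assumptions made for this example.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return 0.0 if se == 0 else (mean(a) - mean(b)) / se


def filter_by_icd(results, threshold=4.0, min_n=3):
    """Drop results whose ICD-10 code group deviates strongly from all other results.

    results: list of (icd_code, value) pairs.
    threshold and min_n are hypothetical tuning values, not taken from the paper.
    Returns (kept_values, excluded_codes).
    """
    by_code = defaultdict(list)
    for code, value in results:
        by_code[code].append(value)

    excluded = set()
    for code, values in by_code.items():
        rest = [v for c, v in results if c != code]
        if len(values) < min_n or len(rest) < min_n:
            continue  # too few results to compare this code against the rest
        if abs(welch_t(values, rest)) > threshold:
            excluded.add(code)

    kept = [v for c, v in results if c not in excluded]
    return kept, excluded


# Toy potassium results in mmol/L; the codes and values are made up for illustration.
results = [
    ("Z00", 4.0), ("Z00", 4.2), ("Z00", 4.4), ("Z00", 4.1),
    ("I10", 4.3), ("I10", 3.9), ("I10", 4.2), ("I10", 4.0),
    ("N18", 5.8), ("N18", 6.0), ("N18", 5.6), ("N18", 5.9),
]
all_values = [v for _, v in results]
kept, excluded = filter_by_icd(results)
print(sorted(excluded), round(stdev(all_values), 3), round(stdev(kept), 3))
```

In this toy data only one code group sits far from the others, so excluding it lowers the standard deviation of the remaining results. A single pass like this compares each code against a pool that still contains the deviating groups; an iterative or more robust baseline would be a reasonable refinement.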
Abstract
Traditional methods for estimating reference intervals (RIs) from patients' routine clinical blood test results typically remove outliers without considering the patients' nuanced health statuses. As a result, the vast majority of test results are discarded from reference interval estimation regardless of the actual health status of the patient. We introduce the Differential Distribution Method (DDM), which uses routine laboratory data coded with ICD-10 to approximate an underlying non-diseased, age- and sex-stratified population from mixed clinical data. By removing test results that stem from subpopulations significantly different from the general population, reference intervals can be generated stratified by sex and age, taking into account the patients' associated health conditions as derived from the ICD-10 coding system. Applying the DDM to blood plasma potassium levels…
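To make the stratification step concrete, the sketch below computes a reference interval per sex and age band from results that have already been filtered. It assumes the conventional nonparametric definition of a reference interval as the central 95% of values (2.5th to 97.5th percentile); the abstract does not state which estimator the DDM uses, and the 20-year band width, the `min_n` cutoff, and the function names are hypothetical choices for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def age_band(age, width=20):
    """Map an age in years to a label such as '60-79' (band width is an assumption)."""
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"


def reference_intervals(records, min_n=40):
    """Estimate a central 95% interval for each (sex, age band) stratum.

    records: iterable of (sex, age, value) tuples, assumed to be pre-filtered.
    min_n is a hypothetical minimum stratum size; smaller strata are skipped.
    Returns {(sex, band): (lower, upper)}.
    """
    strata = defaultdict(list)
    for sex, age, value in records:
        strata[(sex, age_band(age))].append(value)

    intervals = {}
    for key, values in strata.items():
        if len(values) < min_n:
            continue
        # n=40 yields 39 cut points in 2.5% steps: first = 2.5th, last = 97.5th percentile.
        cuts = quantiles(values, n=40, method="inclusive")
        intervals[key] = (cuts[0], cuts[-1])
    return intervals
```

Calling `reference_intervals` on a list of `(sex, age, value)` tuples returns one interval per stratum with enough data, which is the shape of output needed to compare interval widths across age groups.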
Figure 1
Figure 2
Figure 3
Figure 4
Taxonomy
Topics: Machine Learning in Healthcare · Medical Coding and Health Information · Statistical Methods in Clinical Trials
