# Differential Distributions: A refined methodology to indirect reference interval estimation by including Patient's health status according to associated ICD-10 codes

**Authors:** David Schär, Tobias U. Blatter, Harald Witte, Jivko Stoyanov, Martin Hersberger, Christos T. Nakas, Alexander B. Leichtle

PMC · DOI: 10.1016/j.plabm.2025.e00492 · 2025-07-09

## TL;DR

A new method uses patient health data from ICD-10 codes to create more accurate blood test reference intervals that consider age, sex, and health status.

## Contribution

A novel reference interval inference approach that incorporates ICD-10 coding using natural language processing.

## Key findings

- The Differential Distribution Method (DDM) adjusts reference intervals dynamically across patient groups based on age and health status.
- Reference intervals for potassium levels showed tighter confidence intervals in older adults after excluding results from significantly different subpopulations.
- The method reduces the standard deviation of the reference intervals by filtering out test results from patients whose ICD-10 codes mark subpopulations that differ significantly from the general population.

## Abstract

Traditional methods for estimating reference intervals (RIs) from patients' routine clinical blood test results typically remove outliers without considering the patients' nuanced health statuses. As a result, the vast majority of test results are discarded from RI estimation regardless of the patient's actual health status.

We introduce the Differential Distribution Method (DDM), which uses routine laboratory data coded with ICD-10 to approximate an underlying non-diseased, age- and sex-stratified population from mixed clinical data. By removing test results that stem from subpopulations significantly different from the general population, RIs can be generated stratified by sex and age while taking into account the patients' associated health conditions as derived from the ICD-10 coding system.
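
The filtering idea described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the statistical test, so a Welch-type z-test on group means stands in for it. The function names, the `alpha` threshold, the toy potassium values (mmol/L), and the example ICD-10 codes are all hypothetical. The sketch also tests each code group against a population that still contains that group, which is a simplification.

```python
import statistics
from collections import defaultdict
from statistics import NormalDist


def differs_from_population(group, population, alpha=0.05):
    """Two-sided Welch-type z-test on means (a stand-in test, an assumption)."""
    if len(group) < 2:
        return False  # too little data to judge, so keep the results
    se = (statistics.variance(group) / len(group)
          + statistics.variance(population) / len(population)) ** 0.5
    if se == 0:
        return statistics.mean(group) != statistics.mean(population)
    z = (statistics.mean(group) - statistics.mean(population)) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def filter_results(records, alpha=0.05):
    """records: list of (value, icd10_code). Returns (kept_values, excluded_codes)."""
    population = [value for value, _ in records]
    by_code = defaultdict(list)
    for value, code in records:
        by_code[code].append(value)
    excluded = {code for code, values in by_code.items()
                if differs_from_population(values, population, alpha)}
    kept = [value for value, code in records if code not in excluded]
    return kept, excluded


def reference_interval(values):
    """Central 95% interval from the 2.5th and 97.5th percentiles."""
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


# Toy data: four code groups with typical values, one group with shifted values.
# The codes are illustrative labels only, not results from the paper.
typical = [3.8, 3.9, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6]
shifted = [5.6, 5.8, 6.0, 6.2, 6.4]
records = [(v, code) for code in ("Z00", "J06", "M54", "H52") for v in typical]
records += [(v, "N18") for v in shifted]

kept, excluded = filter_results(records)
ri_before = reference_interval([v for v, _ in records])
ri_after = reference_interval(kept)
print("excluded codes:", excluded)
print("RI before filtering:", ri_before)
print("RI after filtering:", ri_after)
```

On this toy data only the shifted group is flagged, and the upper reference limit drops once its results are removed. With large real-world groups, a plain mean test flags even tiny differences, so a production version would likely need an effect-size criterion or a different test.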

Applying the DDM to blood plasma potassium levels demonstrated its ability to adjust RIs dynamically across different patient groups. The method effectively differentiated RIs in a decade-based stratification, showing significant variability and tighter confidence intervals, particularly in older adults (above 60 years). RIs widened slightly with advancing age in both males and females, while their standard deviation was reduced by removing large portions of test results that differed significantly, grouped either by individual ICD-10 code or by clusters of ICD-10 codes.
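
The decade-based stratification with confidence intervals can be illustrated with a short Python sketch. It rests on assumptions: the abstract does not say how the confidence intervals were computed, so a percentile bootstrap is used here as one common choice. The record layout `(sex, age, value)`, the function names, and the synthetic potassium values are hypothetical.

```python
import random
import statistics
from collections import defaultdict


def decade_strata(records):
    """Group (sex, age, value) records by sex and age decade (60 means 60-69)."""
    strata = defaultdict(list)
    for sex, age, value in records:
        strata[(sex, (age // 10) * 10)].append(value)
    return dict(strata)


def percentile_limits(values):
    """Lower and upper reference limits (2.5th and 97.5th percentiles)."""
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


def bootstrap_ci(values, n_boot=500, alpha=0.10, seed=0):
    """Percentile-bootstrap CIs for both reference limits (assumed CI method)."""
    rng = random.Random(seed)
    lows, highs = [], []
    for _ in range(n_boot):
        sample = rng.choices(values, k=len(values))  # resample with replacement
        lo, hi = percentile_limits(sample)
        lows.append(lo)
        highs.append(hi)
    lows.sort()
    highs.sort()
    a = round(alpha / 2 * n_boot)
    b = n_boot - 1 - a
    return (lows[a], lows[b]), (highs[a], highs[b])


# Synthetic values for illustration only; not data from the paper.
rng = random.Random(1)
records = [(sex, age, rng.gauss(4.2, 0.4))
           for sex in ("F", "M") for age in (45, 52, 63, 68) for _ in range(50)]

strata = decade_strata(records)
results = {key: (percentile_limits(vals), bootstrap_ci(vals))
           for key, vals in strata.items()}
for key in sorted(results):
    limits, (lo_ci, hi_ci) = results[key]
    print(key, "RI:", limits, "lower-limit CI:", lo_ci, "upper-limit CI:", hi_ci)
```

Strata with more results (here the 60 to 69 group, which pools two ages) tend to give narrower bootstrap intervals, which is one reason sample size per stratum matters when comparing confidence intervals across age groups.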

This DDM data mining approach offers a robust framework for RI inference by generating adjusted RIs that incorporate clinical nuances reflected in ICD-10 codes. This approach not only enhances the accuracy of patient diagnostics but also facilitates the identification of potential multimorbidities affecting laboratory results.

- Introducing a novel reference interval inference approach incorporating ICD-10 coding.
- The ICD-10 codes are included in the inference using natural language processing.
- The resulting reference intervals are adjusted for age, sex, and health status.
- These reference intervals provide helpful substitutes for elderly or multimorbid patients.

## Full-text entities

- **Chemicals:** potassium (MESH:D011188)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12275890/full.md

---
Source: https://tomesphere.com/paper/PMC12275890