# Modeling Multivariate Distributions of Lipid Panel Biomarkers for Reference Interval Estimation and Comorbidity Analysis

**Authors:** Julian Velev, Luis Velázquez-Sosa, Jack Lebien, Heeralal Janwa, Abiel Roche-Lima

PMC · DOI: 10.3390/healthcare13192499 · Healthcare · 2025-10-01

## TL;DR

This paper introduces a method for estimating reference intervals for lipid biomarkers from routinely collected real-world laboratory data. The intervals are stratified by age and sex, the influence of comorbidities is excluded, and no healthy cohort needs to be recruited.

## Contribution

A data-driven approach using Gaussian Mixture Models to derive age- and sex-stratified reference intervals from real-world lab data, incorporating comorbidity networks and selective mortality effects.

## Key findings

- Reference intervals for lipid biomarkers were derived from real-world data, stratified by sex and age.
- Selective survival explained apparent improvements in biomarker profiles after midlife.
- Comorbidity networks revealed strong influences on biomarker ranges and interdependencies between conditions.

## Abstract

**Background/Objectives:** Laboratory tests are a cornerstone of modern medicine, and their interpretation depends on reference intervals (RIs) that define expected values in healthy populations. Standard RIs are obtained in cohort studies that are costly and time-consuming and typically do not account for demographic factors such as age, sex, and ethnicity that strongly influence biomarker distributions. This study establishes a data-driven approach for deriving RIs directly from routinely collected laboratory results.

**Methods:** Multidimensional joint distributions of lipid biomarkers were estimated from large-scale real-world laboratory data from the Puerto Rican population using a Gaussian Mixture Model (GMM). The GMM and additional statistical analyses were used to separate healthy and pathological subpopulations and to exclude the influence of comorbidities, all without the use of diagnostic codes. Selective mortality patterns were examined to explain counterintuitive age trends in lipid values, while comorbidity implication networks were constructed to characterize interdependencies between conditions.

**Results:** The approach yielded sex- and age-stratified RIs for lipid panel biomarkers (total cholesterol, LDL, HDL, triglycerides), estimated from the inferred distributions. Apparent improvements in biomarker profiles after midlife were explained by selective survival. Comorbidities exerted pronounced effects on the 95% ranges, with their broader influence captured through network analysis. Beyond fixed limits, the method yields full distributions, allowing each individual result to be mapped to a percentile and interpreted as a continuous measure of risk.

**Conclusions:** Population-specific and sex- and age-segmented RIs can be derived from real-world laboratory data without recruiting healthy cohorts. Incorporating selective mortality effects and comorbidity networks provides additional insight into population health dynamics.
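The mixture-model idea in the Methods can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the paper fits multivariate GMMs to real laboratory data, whereas this example fits a two-component, one-dimensional mixture to synthetic values with plain expectation-maximization. The function names (`fit_gmm_1d`, `percentile_of`), the synthetic LDL-like values, and the rule that the highest-weight component represents the healthy subpopulation are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def fit_gmm_1d(values, n_iter=150):
    """Fit a two-component 1D Gaussian mixture with plain EM (illustrative only)."""
    xs = sorted(values)
    n = len(xs)
    # Initialise the component means at the lower and upper quartiles.
    mu = [xs[n // 4], xs[3 * n // 4]]
    spread = (xs[-1] - xs[0]) / 4 or 1.0
    sigma = [spread, spread]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        comps = [NormalDist(m, s) for m, s in zip(mu, sigma)]
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            p = [w * c.pdf(x) for w, c in zip(weight, comps)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and standard deviations.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weight[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)
    return weight, mu, sigma


# Synthetic data (assumption): 70% "healthy" values around 100 and
# 30% "pathological" values around 180, in arbitrary mg/dL-like units.
random.seed(0)
data = [random.gauss(100, 15) for _ in range(1400)]
data += [random.gauss(180, 20) for _ in range(600)]

weights, means, sigmas = fit_gmm_1d(data)

# Assumption: the component with the largest weight is the healthy one.
healthy = max(range(2), key=lambda k: weights[k])
healthy_dist = NormalDist(means[healthy], sigmas[healthy])

# Central 95% range of the healthy component as the reference interval.
ri_low = healthy_dist.inv_cdf(0.025)
ri_high = healthy_dist.inv_cdf(0.975)


def percentile_of(value):
    """Map one result to its percentile within the healthy component."""
    return 100.0 * healthy_dist.cdf(value)


pct = percentile_of(130.0)
print(f"reference interval: {ri_low:.1f} to {ri_high:.1f}")
print(f"percentile of 130: {pct:.1f}")
```

A multivariate version would replace the scalar means and variances with mean vectors and covariance matrices, and a stratified analysis would repeat the fit within each age and sex group. The percentile mapping at the end corresponds to the abstract's point that a full distribution allows each result to be read as a continuous measure rather than as inside or outside fixed limits.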

## Full-text entities

- **Chemicals:** triglycerides (MESH:D014280), Lipid (MESH:D008055), cholesterol (MESH:D002784)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12523935/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12523935/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/PMC12523935/full.md

---
Source: https://tomesphere.com/paper/PMC12523935