# Establishment and Validation of Serum Ferritin Reference Intervals Based on Real-World Big Data and Multi-Strategy Partitioning Algorithms

**Authors:** Yixin Xu, Xiaojuan Wu, Junlong Zhang, Qian Niu, Bei Cai, Qiang Miao

PMC · DOI: 10.3390/jcm15030976 · Journal of Clinical Medicine · 2026-01-26

## TL;DR

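This study derives sex- and age-specific reference intervals for serum ferritin from 29,723 health-examination results measured on the Roche Cobas e801 platform. In an independent validation cohort, these intervals fit the local population far better than the manufacturer-provided ones.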

## Contribution

A multi-strategy partitioning framework that combines decision tree regression with Harris–Boyd verification to derive population-specific serum ferritin reference intervals from real-world laboratory data.

## Key findings

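- Males had significantly higher serum ferritin concentrations than females (p < 0.001).
- Serum ferritin rose with age in females (r = 0.466, p < 0.001), with 50 years identified as the partitioning cut-off; no such association was observed in males.
- In validation, study-derived reference intervals achieved pass rates of 93.83–94.72%, versus 37.12–73.97% for the manufacturer-provided intervals (all p < 0.001).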

## Abstract

Background/Objectives: We aimed to establish and validate population-based reference intervals (RIs) for serum ferritin (SF) using an indirect, data-driven approach based on real-world laboratory data and to optimize partitioning strategies. Methods: SF results from 29,723 apparently healthy individuals who underwent health examinations at West China Hospital between 2020 and 2024 were retrospectively analyzed. SF was measured on a Roche Cobas e801 electrochemiluminescence immunoassay platform. After Box–Cox transformation, outliers were removed using an iterative Tukey method. Potential partitioning factors were evaluated, and data-driven age cut-points were explored using decision tree regression and verified with the Harris–Boyd criteria. RIs were estimated using nonparametric percentile methods and validated in an independent cohort of 2494 individuals. Results: SF concentrations were significantly higher in males than in females (p < 0.001). In females, SF showed a significant positive association with age (r = 0.466, p < 0.001), whereas no such association was observed in males. Decision tree analysis identified 50 years as the optimal age cut-off for females (R² = 0.2467). The final study-derived RIs were 98.02–997.78 µg/L for males, 10.30–299.55 µg/L for females ≤ 50 years, and 36.61–507.00 µg/L for females > 50 years. In the validation cohort, the study-derived RIs achieved pass rates of 93.83–94.72%, which were significantly higher than those of the manufacturer-provided RIs (37.12–73.97%, all p < 0.001). Conclusions: Using a large health examination database and a multi-step partitioning strategy, we established robust sex- and age-specific SF RIs on the Roche Cobas e801 platform for the local population. This work provides a reproducible, generalizable framework for indirect RI determination of other biomarkers.
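The steps named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions. The Box–Cox step is replaced by a plain log transform (the λ = 0 case), since the fitted λ is not given here. The Tukey fence multiplier `k = 1.5` is the conventional default, not a value from the paper. The Harris–Boyd check uses a commonly cited form of the criteria (a z statistic compared against 3·√((n₁+n₂)/240), or a standard deviation ratio above 1.5). All function names are hypothetical.

```python
import math
import statistics


def remove_outliers(values, k=1.5):
    """Iterative Tukey filter on log-transformed values (assumed lambda = 0).

    Repeats until no point falls outside [Q1 - k*IQR, Q3 + k*IQR].
    Values must be positive; the survivors are returned on the original scale.
    """
    kept = sorted(values)
    while len(kept) >= 4:
        logs = [math.log(v) for v in kept]
        q1, _, q3 = statistics.quantiles(logs, n=4, method="inclusive")
        lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
        filtered = [v for v in kept if lo <= math.log(v) <= hi]
        if len(filtered) == len(kept):
            break  # converged: nothing was removed in this pass
        kept = filtered
    return kept


def reference_interval(values):
    """Nonparametric central 95% interval: 2.5th and 97.5th percentiles."""
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]  # 1/40 = 2.5%, 39/40 = 97.5%


def needs_partition(group_a, group_b):
    """Harris-Boyd style check (assumed form) on two subgroups.

    Inputs should already be on the transformed scale; both groups need
    a nonzero standard deviation.
    """
    n1, n2 = len(group_a), len(group_b)
    m1, m2 = statistics.mean(group_a), statistics.mean(group_b)
    s1, s2 = statistics.stdev(group_a), statistics.stdev(group_b)
    z = abs(m1 - m2) / math.sqrt(s1**2 / n1 + s2**2 / n2)
    z_critical = 3 * math.sqrt((n1 + n2) / 240)
    sd_ratio = max(s1, s2) / min(s1, s2)
    return z > z_critical or sd_ratio > 1.5


def pass_rate(values, interval):
    """Fraction of validation results that fall inside the interval."""
    lo, hi = interval
    return sum(lo <= v <= hi for v in values) / len(values)
```

In use, each candidate subgroup (for example, females at or below versus above a proposed age cut-off) would be cleaned with `remove_outliers`, compared with `needs_partition`, and, if partitioning is supported, given its own interval via `reference_interval`. `pass_rate` then scores that interval against an independent validation set.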

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12898406/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12898406/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/PMC12898406/full.md

---
Source: https://tomesphere.com/paper/PMC12898406