# Machine learning improves detection of alpha thalassemia carriers compared to clinical features

**Authors:** Elmira Mohammadi, Mohsen Rastegar, Amir Jamshidnezhad, Amirabbas Azizi

PMC · DOI: 10.1038/s41598-025-20605-6 · Scientific Reports · 2025-10-21

## TL;DR

Machine learning models, in particular a stacking ensemble, detect alpha-thalassemia carriers from hematological parameters more accurately than approaches based on clinical features.

## Contribution

The paper proposes a machine learning framework that classifies alpha-plus (α⁺) and alpha-zero (α⁰) thalassemia carriers from hematological parameters.

## Key findings

- A stacking ensemble, the best of five models compared, achieved 94% accuracy in classifying α⁺ versus α⁰ carriers.
- RBC count, MCV, MCH, and MCHC were key predictors for classification.
- RBC indices were strongly correlated with one another, while platelet (PLT) and white blood cell (WBC) parameters showed only moderate associations.
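
One plausible contributor to the strong correlations among RBC indices is that MCV, MCH, and MCHC are derived from the same three measurements (RBC count, hemoglobin, and hematocrit) using the standard formulas. The sketch below illustrates this with a hand-written Pearson coefficient. The sample readings are invented for illustration, and the use of Pearson correlation is an assumption, since this summary does not state which correlation measure the paper used.

```python
import math


def rbc_indices(rbc, hb, hct):
    """Standard red-cell indices from RBC (10^12/L), Hb (g/dL), Hct (%)."""
    mcv = hct * 10 / rbc   # mean corpuscular volume, fL
    mch = hb * 10 / rbc    # mean corpuscular hemoglobin, pg
    mchc = hb * 100 / hct  # mean corpuscular hemoglobin concentration, g/dL
    return mcv, mch, mchc


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (RBC, Hb, Hct) readings, invented for illustration only.
samples = [(5.0, 13.0, 40.0), (6.0, 12.6, 40.2), (5.5, 13.2, 41.25), (6.2, 12.4, 40.3)]
indices = [rbc_indices(*s) for s in samples]
mcv = [i[0] for i in indices]
mch = [i[1] for i in indices]

r_mcv_mch = pearson(mcv, mch)
print(f"Pearson r(MCV, MCH) = {r_mcv_mch:.3f}")
```

Because MCV and MCH share the RBC count as a denominator, and MCHC equals MCH / MCV × 100, these indices tend to move together. This may also matter when interpreting feature importance, since correlated predictors can share credit.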

## Abstract

Alpha-thalassemia is a widespread genetic disorder, and accurately distinguishing between alpha-plus (α⁺) and alpha-zero (α⁰) types is critical for effective screening and management. This study developed and evaluated machine learning models to classify α⁺ and α⁰ carriers based on hematological parameters. A dataset of 956 cases was analyzed, including variables such as red blood cell (RBC) count, hemoglobin (Hb) level, and RBC indices. Feature selection identified the most predictive markers, and five machine learning models were trained and compared. The stacking ensemble model demonstrated the best performance, achieving 94% accuracy and a high F1-score. Key predictors included RBC count, mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), and mean corpuscular hemoglobin concentration (MCHC). Correlation analysis revealed strong interrelationships among RBC indices, while platelet (PLT) and white blood cell (WBC) parameters had moderate associations. These findings suggest that machine learning, particularly ensemble methods, can enhance the detection of alpha-thalassemia carriers. The development of models based on both data-driven and clinical features provides a flexible framework for screening and could support more personalized approaches in future research.
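
To make the stacking idea concrete, here is a minimal, dependency-free sketch of the general technique: base learners are trained on part of the data, their out-of-fold predictions become features for a meta-learner, and the meta-learner makes the final call. Everything specific here is an assumption for illustration. This summary does not name the paper's five models or how its ensemble was composed, so one-feature threshold rules stand in as base learners and a small logistic regression stands in as the meta-learner. The eight data rows are invented and are not from the study's 956 cases.

```python
import math

# Made-up (RBC 10^12/L, MCV fL, MCH pg) rows; label 1 = alpha-zero, 0 = alpha-plus.
# Illustrative values only; they do not come from the paper.
X = [
    (5.0, 80.0, 26.5), (6.1, 68.0, 21.5), (5.9, 70.0, 22.5), (5.2, 78.0, 26.0),
    (4.9, 82.0, 27.0), (6.3, 66.0, 21.0), (6.0, 71.0, 23.0), (5.1, 79.0, 25.5),
]
y = [0, 1, 1, 0, 0, 1, 1, 0]


def fit_stump(values, labels):
    """Base learner: the single-threshold rule with the fewest training errors."""
    points = sorted(set(values))
    # Candidate thresholds are midpoints between neighbouring observed values.
    cuts = [(a + b) / 2 for a, b in zip(points, points[1:])]
    best = None
    for cut in cuts:
        for sign in (1, -1):  # sign picks which side of the cut predicts class 1
            errors = sum((1 if sign * (v - cut) > 0 else 0) != t
                         for v, t in zip(values, labels))
            if best is None or errors < best[0]:
                best = (errors, cut, sign)
    _, cut, sign = best
    return lambda v: 1 if sign * (v - cut) > 0 else 0


def fit_base(rows, labels):
    """Train one stump per feature column."""
    return [fit_stump([r[j] for r in rows], labels) for j in range(len(rows[0]))]


def base_outputs(stumps, row):
    """Predictions of every base learner for one row."""
    return [s(row[j]) for j, s in enumerate(stumps)]


def fit_logistic(features, labels, lr=0.5, epochs=500):
    """Meta-learner: logistic regression fitted by batch gradient descent."""
    w, b = [0.0] * len(features[0]), 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for f, t in zip(features, labels):
            p = 1 / (1 + math.exp(-(sum(wi * fi for wi, fi in zip(w, f)) + b)))
            for j, fj in enumerate(f):
                grad_w[j] += (p - t) * fj / n
            grad_b += (p - t) / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def fit_stacking(rows, labels, n_folds=2):
    """Fit base learners and a meta-learner on out-of-fold base predictions."""
    meta = [None] * len(rows)
    for k in range(n_folds):
        train = [i for i in range(len(rows)) if i % n_folds != k]
        stumps = fit_base([rows[i] for i in train], [labels[i] for i in train])
        for i in range(k, len(rows), n_folds):  # rows held out of this fold
            meta[i] = base_outputs(stumps, rows[i])
    w, b = fit_logistic(meta, labels)
    stumps = fit_base(rows, labels)  # refit base learners on all data

    def predict(row):
        z = sum(wi * fi for wi, fi in zip(w, base_outputs(stumps, row))) + b
        return 1 if z > 0 else 0

    return predict


predict = fit_stacking(X, y)
predictions = [predict(row) for row in X]
print(predictions)
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for rows the base learners have already seen, is what keeps the second stage from simply inheriting the base learners' overfitting. On real data, performance would be reported on a held-out test set, not on the training rows used here.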

## Linked entities

- **Diseases:** alpha-thalassemia (MONDO:0011399)

## Full-text entities

- **Diseases:** genetic disorder (MESH:D030342), Alpha-thalassemia (MESH:D017085)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12541011/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12541011/full.md

## References

11 references, with the full list in the complete paper: https://tomesphere.com/paper/PMC12541011/full.md

---
Source: https://tomesphere.com/paper/PMC12541011