# Interpretable machine learning models for beta thalassemia prediction: an explainable AI approach for smart healthcare 5.0

**Authors:** Maria Abbas, Muhammad Bilal Shoaib Khan, Abdul Hannan Khan, Anas Bilal, Asaad Algarni, Raheem Sarwar

PMC · DOI: 10.3389/fmed.2025.1688645 · Frontiers in Medicine · 2026-01-14

## TL;DR

This paper presents an interpretable AI system for predicting beta thalassemia, using XAI techniques to improve transparency and clinical decision-making.

## Contribution

The novel contribution is an expert system combining LSTM with SHAP and LIME for interpretable beta thalassemia prediction.

## Key findings

- LSTM achieved 99.30% accuracy, with 99.33% precision, recall, specificity, and F1 score, on the BTC dataset.
- SHAP and LIME were integrated to provide global and local interpretability of model predictions.
- PCA and SMOTE were used to reduce dimensionality and balance the dataset for better model performance.

## Abstract

Beta thalassemia is an inherited blood disorder that limits the production of beta globin, a protein that contributes substantially to the formation of hemoglobin and red blood cells (RBCs). This protein also enables cells to carry oxygen to tissues throughout the human body. The disease is caused by genetic variation in the hemoglobin beta gene, which signals the body to make beta globin chains, and it has three main types: major, intermedia, and minor. An expert system is needed for the diagnosis of this disease.

This study introduces an interpretable expert system for the prediction of beta thalassemia, incorporating Explainable AI (XAI) techniques to meet clinical needs. Principal Component Analysis (PCA) and the Synthetic Minority Over-sampling Technique (SMOTE) are applied to the Beta Thalassemia Carrier (BTC) dataset of 5,066 patients to reduce dimensionality and balance the output classes. Machine learning classifiers, namely Neural Networks, Recurrent Neural Networks, and Long Short-Term Memory (LSTM), are then applied.
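The preprocessing described above can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the toy data, the function names `smote_oversample` and `pca_fit_transform`, the neighbour count `k=5`, the number of retained components, and the order (oversample first, then project) are all assumptions made for illustration. In practice, oversampling would normally be fitted on the training split only.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """SMOTE-style synthesis: interpolate between a minority sample
    and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise squared Euclidean distances among minority samples.
    d = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)            # seed samples
    nb = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                     # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[nb] - X_min[base])

def pca_fit_transform(X, n_components):
    """PCA via SVD of the mean-centred data matrix."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / (S ** 2).sum()            # explained variance ratio
    return Xc @ Vt[:n_components].T, explained[:n_components]

# Hypothetical imbalanced toy data: 90 majority vs 10 minority samples.
data_rng = np.random.default_rng(0)
X_major = data_rng.normal(0.0, 1.0, size=(90, 6))
X_minor = data_rng.normal(1.5, 1.0, size=(10, 6))

# Balance the classes, then reduce 6 features to 3 components.
X_syn = smote_oversample(X_minor, n_new=80, k=5, rng=1)
X_bal = np.vstack([X_major, X_minor, X_syn])
y_bal = np.array([0] * 90 + [1] * 90)
Z, ratio = pca_fit_transform(X_bal, n_components=3)

print(Z.shape, np.bincount(y_bal), ratio.round(3))
```

Because each synthetic point is a convex combination of two real minority samples, it stays inside the range of the observed minority data rather than duplicating existing rows.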

The last of these, LSTM, achieves 99.30% accuracy, 99.33% precision, 99.33% recall, 99.33% specificity, and a 99.33% F1 score.
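For readers less familiar with these metrics, the sketch below shows how they are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the counts are hypothetical and chosen only for illustration; they are not the paper's confusion matrix, and the paper's exact averaging scheme is not stated in this summary.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)        # of predicted positives, how many are correct
    recall = tp / (tp + fn)           # sensitivity: of actual positives, how many are found
    specificity = tn / (tn + fp)      # of actual negatives, how many are found
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts, not taken from the paper.
m = binary_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Reporting specificity alongside recall matters in a screening setting, since the two capture different error types: missed carriers versus false alarms.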

Furthermore, to ensure the model's transparency and interpretability, the proposed method integrates SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), enabling both global and local interpretability of model predictions. SHAP gives insight into important features at the global level, while LIME explains individual predictions, making the model's decisions more comprehensible for clinical applications.
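The attribution idea behind SHAP can be sketched by computing exact Shapley values for a tiny model. This is an illustration of the underlying concept only, not the SHAP library and not the authors' setup: `shapley_values`, `toy_model`, the input `x`, and the all-zero `baseline` are hypothetical, and "absent" features are simply replaced by their baseline value. Exact enumeration is exponential in the number of features, which is why practical tools rely on approximations.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value; the rest use the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            w = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += w * (value(s | {i}) - value(s))
    return phi

def toy_model(z):
    # Hypothetical model with one interaction term between features 0 and 2.
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]

x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved. Averaging the magnitude of such per-sample attributions over a dataset is one common way to obtain a global feature ranking.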

## Linked entities

- **Proteins:** HBB (hemoglobin subunit beta)
- **Diseases:** beta thalassemia (MONDO:0019402)

## Full-text entities

- **Genes:** HBB (hemoglobin subunit beta) [NCBI Gene 3043] {aka CD113t-C, ECYT6, beta-globin}
- **Diseases:** Beta Thalassemia (MESH:D017086), inherited blood disorder (MESH:D025861)
- **Chemicals:** oxygen (MESH:D010100)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12847264/full.md

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12847264/full.md

## References

62 references — full list in the complete paper: https://tomesphere.com/paper/PMC12847264/full.md

---
Source: https://tomesphere.com/paper/PMC12847264