# Interpretable Machine Learning with SHAP Identifies Key Biomarkers in a Multi-Factorial Spectrum of Age-Related Neurological and Metabolic Conditions

**Authors:** Daniil V. Artamonov, Polina I. Popova, Ekaterina A. Korf, Natalia G. Voitenko, Alisa A. Chernysheva, Pavel V. Avdonin, Richard O. Jenkins, Nikolay V. Goncharov

PMC · DOI: 10.3390/ijms27041805 · International Journal of Molecular Sciences · 2026-02-13

## TL;DR

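This study applies a depth-limited gradient boosting classifier with SHAP explanations to 49 blood biochemistry features from 120 patients, identifying iron, transferrin, and glucose as key biomarkers for distinguishing age-related neurological and metabolic conditions.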

## Contribution

The study introduces a workflow that combines variance-aware statistical testing with SHAP-explained gradient boosting to identify stable biomarkers in multi-factorial age-related conditions.
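This summary does not name the variance test the authors used. As one common way to check for heterogeneity of variance before choosing an ANOVA variant, the sketch below computes the Brown-Forsythe (median-centred Levene) statistic from its definition. The function name and the toy groups are hypothetical and are not taken from the paper.

```python
from statistics import mean, median


def brown_forsythe_statistic(groups):
    """Return the median-centred Levene statistic W for a list of numeric groups.

    Assumption: this is an illustrative check, not the paper's documented test.
    """
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")

    # Absolute deviation of every observation from its own group median.
    deviations = []
    for g in groups:
        centre = median(g)
        deviations.append([abs(v - centre) for v in g])

    group_means = [mean(z) for z in deviations]
    n_total = sum(len(z) for z in deviations)
    grand_mean = sum(sum(z) for z in deviations) / n_total

    # Between-group and within-group sums of squares of the deviations.
    between = sum(
        len(z) * (m - grand_mean) ** 2 for z, m in zip(deviations, group_means)
    )
    within = sum(
        (v - m) ** 2 for z, m in zip(deviations, group_means) for v in z
    )
    if within == 0:
        raise ValueError("deviations are constant within every group")

    k = len(groups)
    return (n_total - k) / (k - 1) * between / within


# Toy values (made up): the second group is far more spread out than the first.
w = brown_forsythe_statistic([[1, 2, 3], [0, 10, 20]])
```

Under the null hypothesis of equal variances, W is compared against an F distribution with (k - 1, N - k) degrees of freedom. In practice `scipy.stats.levene(..., center='median')` returns both the statistic and the p-value; the hand-written version here only makes the computation explicit.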

## Key findings

- Gradient boosting (max depth = 3) with SHAP reached F1-scores of 0.87 to 0.96 across the five clinical classes and identified iron, transferrin, and glucose as key biomarkers with synergistic interactions.
- Feature-importance ranks from statistical significance and from SHAP values correlated moderately (Spearman 0.53 for groups 1–2 and 0.59 for groups 1–5).
- Unsupervised KMeans clustering (k = 5) aligned poorly with clinical labels (ARI 0.198, NMI 0.286), highlighting the need for interpretable ML in diagnostics.
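To make the clustering finding concrete, here is a minimal sketch of the Adjusted Rand Index, one of the two agreement measures the abstract reports, computed from the contingency table of two labelings. The function name and the toy labels are hypothetical; the paper's own pipeline is not shown in this summary.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the labelings trivially agree

    # Pairs of items that share a cell, a true class, or a predicted cluster.
    cell_pairs = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    true_pairs = sum(comb(c, 2) for c in Counter(labels_true).values())
    pred_pairs = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = true_pairs * pred_pairs / total_pairs
    maximum = (true_pairs + pred_pairs) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (cell_pairs - expected) / (maximum - expected)


# Toy labels (made up): same partition under different cluster names.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Toy labels (made up): every cluster mixes the two true classes.
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

An ARI of 1 means the partitions are identical up to renaming, and values near 0 mean agreement no better than chance, which puts the reported 0.198 in context. `sklearn.metrics.adjusted_rand_score` provides the same measure for real workloads.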

## Abstract

Vascular and metabolic disorders in the elderly—including acute ischemic stroke (AIS), chronic cerebral circulation insufficiency (CCCI), type 2 diabetes mellitus (DM), and subcortical ischemic vascular dementia (SIVD)—pose a major diagnostic challenge due to their reliance on multi-parameter blood chemistry. In this study, 49 biochemical features were analyzed within a cohort of 120 patients. The application of variance-aware statistical testing revealed that several features (e.g., Fe, Transf, RDW%, LDL) exhibited statistically significant heterogeneity of variance (p < 0.05), which is known to distort standard ANOVA inference. While standard machine-learning (ML) classifiers demonstrated variable performance across clinical groups, a gradient boosting model with restricted tree depth (max depth = 3) achieved high discriminative accuracy, yielding F1-scores between 0.87 and 0.96 across all five clinical classes. Through the use of Shapley Additive Explanations (SHAP), key stable biomarkers including iron (Fe), transferrin, and glucose were identified as having synergistic interactions in model predictions. A comparative analysis of feature importance ranks indicated consistency between statistical significance and SHAP values, with Spearman correlation coefficients reaching 0.53 for groups 1–2 and 0.59 for groups 1–5. Conversely, unsupervised KMeans clustering (k = 5) revealed a poor correspondence with clinical labels, yielding an Adjusted Rand Index (ARI) of 0.198 and Normalized Mutual Information (NMI) of 0.286. These results underscore that statistical structures in biochemical data do not always map to meaningful clinical categories and advocate for the adoption of variance-aware workflows and interpretable ML to enhance diagnostic reliability in aging populations.
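The rank comparison described in the abstract (statistical significance versus SHAP importance) can be illustrated with a small Spearman correlation sketch. This is a from-scratch version under stated assumptions: the function names and the toy importance scores are hypothetical, and how the authors derived their ranks is not specified in this summary.

```python
from statistics import mean


def average_ranks(values):
    """Rank values so the largest gets rank 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least two items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sequence has no defined rank correlation")
    return cov / (var_x * var_y) ** 0.5


# Toy scores (made up) for four features: one per importance measure.
stat_importance = [0.9, 0.5, 0.3, 0.1]
shap_importance = [0.4, 0.8, 0.2, 0.1]
rho = spearman_rho(stat_importance, shap_importance)
```

Here the two measures disagree only on the order of the top two features, so the correlation stays high. `scipy.stats.spearmanr` computes the same coefficient along with a p-value.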

## Linked entities

- **Chemicals:** iron (PubChem CID 23925), glucose (PubChem CID 5793)
- **Diseases:** type 2 diabetes mellitus (MONDO:0005148)

## Full-text entities

- **Genes:** VWF (von Willebrand factor) [NCBI Gene 7450] {aka F8VWF, VWD}, LDLR (low density lipoprotein receptor) [NCBI Gene 3949] {aka LDLCQ2}, CALCA (calcitonin related polypeptide alpha) [NCBI Gene 796] {aka CALC1, CGRP, CGRP-I, CGRP-alpha, CGRP1, CT}, BCHE (butyrylcholinesterase) [NCBI Gene 590] {aka BCHED, CHE1, CHE2, E1}, ATHS (atherosclerosis susceptibility (lipoprotein associated)) [NCBI Gene 470] {aka ALP}, PON1 (paraoxonase 1) [NCBI Gene 5444] {aka ESA, MVCD5, PON}, SHROOM4 (shroom family member 4) [NCBI Gene 57477] {aka MRXSSDS, SHAP, shrm4}, ADAMTS13 (ADAM metallopeptidase with thrombospondin type 1 motif 13) [NCBI Gene 11093] {aka ADAM-TS13, ADAMTS-13, C9orf8, VWFCP, vWF-CP}, TF (transferrin) [NCBI Gene 7018] {aka HEL-S-71p, PRO1557, PRO2086, TFQTL1}, ALB (albumin) [NCBI Gene 213] {aka FDAHT, HSA, PRO0883, PRO0903, PRO1341}, GTF2E1 (general transcription factor IIE subunit 1) [NCBI Gene 2960] {aka FE, TF2E1, TFIIE-A}
- **Diseases:** white matter hyperintensities (MESH:D056784), gait disturbances (MESH:D020233), skin cancer (MESH:D012878), gastric cancer (MESH:D013274), retinopathy (MESH:D058437), stroke (MESH:D020521), cerebral small vessel disease (MESH:D059345), neurological dysfunction (MESH:D009461), AIS (MESH:D000083242), inflammatory (MESH:D007249), neurodegenerative (MESH:D019636), obliterating disease (MESH:D004194), injury to (MESH:D014947), Vascular and metabolic disorders (MESH:D024821), impaired cerebral autoregulation (MESH:D002547), coronary heart disease (MESH:D003327), cancer (MESH:D009369), DM (MESH:D003920), HD (MESH:D006816), vascular disease (MESH:D014652), polyneuropathy (MESH:D011115), Prediabetes (MESH:D011236), anterior cruciate ligament injuries (MESH:D000070598), CCCI (MESH:D051436), TIA (MESH:D002546), type 2 diabetes (MESH:D003924), heart failure (MESH:D006333), lymph node metastasis (MESH:D008207), executive dysfunction (MESH:D006331), dementia (MESH:D003704), cognitive decline (MESH:D003072), lacunar infarctions (MESH:D059409), infectious diseases (MESH:D003141), Neurological and Metabolic Conditions (MESH:D001928), death (MESH:D003643), Binswanger's disease (MESH:D015140), psychomotor slowing (MESH:D011596), infections (MESH:D007239), cardiovascular diseases (MESH:D002318), ischemic stroke (MESH:D002544), Age- (MESH:D019588)
- **Chemicals:** Fe (MESH:D007501), NEFA (MESH:D005230), Glu (MESH:D018698), ristocetin (MESH:D012310), Ur (MESH:D014529), citrate (MESH:D019343), lipid (MESH:D008055), RANDOX (MESH:C009158), PBS (MESH:D007854), heparin (MESH:D006493), Glucose (MESH:D005947), Ca (MESH:D002118), HCT (MESH:D006852)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12941188/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12941188/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC12941188/full.md

---
Source: https://tomesphere.com/paper/PMC12941188