# Development and validation of a machine learning model for predicting hypersplenism in Wilson disease patients

**Authors:** Qiaoyu Xuan, Xiuquan Shi, Lei Jin, Daiping Hua, Lanting Sun, Wenming Yang, Han Wang

PMC · DOI: 10.3389/fmed.2026.1768024 · Frontiers in Medicine · 2026-03-18

## TL;DR

This study develops a machine learning model to predict hypersplenism in Wilson disease patients using clinical indicators, enabling early identification and better management of high-risk individuals.

## Contribution

The novel contribution is an SVM-based predictive model for hypersplenism in Wilson disease, validated on clinical data and interpreted using SHAP analysis.

## Key findings

- The SVM model achieved a training-set AUC of 0.867 (test-set AUC 0.771) and good calibration for predicting hypersplenism in Wilson disease patients.
- PIIINP was identified as the core predictive feature, followed by WBC, PLT, and A/G.
- The model provides a quantitative basis for risk stratification and early intervention in high-risk WD patients.

## Abstract

Wilson disease (WD) is a rare autosomal recessive copper metabolism disorder, with hypersplenism as a severe, common complication secondary to disease-related cirrhosis. Currently, there is a lack of precise early prediction tools for this complication. This study aimed to construct a hypersplenism prediction model for WD patients by integrating multidimensional clinical indicators and machine learning, providing references for early identification of high-risk individuals and personalized interventions.

A total of 524 WD patients were enrolled at the First Affiliated Hospital of Anhui University of Chinese Medicine from December 2019 to February 2025, including 244 with hypersplenism (HG) and 280 without (non-HG). Key variables were selected using LASSO regression, and multicollinearity among the model's variables was assessed using variance inflation factors (VIF). The predictive model was visualized using a nomogram. Five machine learning models were built, with 10-fold cross-validation for parameter optimization. Finally, model performance was evaluated, and feature contributions were explained using the SHapley Additive exPlanations (SHAP) method.
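The 10-fold cross-validation step can be illustrated with a minimal stratified split. This is a sketch, not the authors' pipeline: the abstract does not state how folds were formed or how the cohort was divided into training and test sets. The helper name `stratified_kfold` and the seed are therefore assumptions, and the split is applied to the full cohort (244 HG, 280 non-HG) only to keep the numbers concrete.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Return k lists of held-out indices with each class spread evenly across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class
        # stopped so that total fold sizes stay balanced.
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)
    return folds


# Cohort sizes from the abstract: 244 HG (label 1) and 280 non-HG (label 0).
labels = [1] * 244 + [0] * 280
folds = stratified_kfold(labels, k=10, seed=42)  # seed is an arbitrary choice

for fold_id, held_out in enumerate(folds):
    held_out_set = set(held_out)
    train_idx = [i for i in range(len(labels)) if i not in held_out_set]
    n_hg = sum(labels[i] for i in held_out)
    print(f"fold {fold_id}: train={len(train_idx)} held-out={len(held_out)} HG in held-out={n_hg}")
```

Stratifying keeps the HG/non-HG ratio roughly constant in every fold, so each candidate hyperparameter setting is scored on held-out data that resembles the cohort as a whole.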

Compared with the non-HG group, the HG group had significantly lower white blood cell count (WBC), platelet count (PLT), and ceruloplasmin (CER), and higher albumin/globulin ratio (A/G), procollagen III N-terminal peptide (PIIINP), type IV collagen (CIV), hyaluronic acid (HA), laminin (LN), and 24-h urinary copper (CUU) (all p < 0.05). Multivariate logistic regression showed that A/G, CIV, and PIIINP were independent risk factors, while WBC and PLT were independent protective factors. The SVM model performed best: training set AUC = 0.867 (95% CI: 0.830–0.904), accuracy = 0.807, specificity = 0.856, precision = 0.812, F1 score = 0.771; test set AUC = 0.771 (95% CI: 0.699–0.844) with AUC decay <10%. It also had excellent calibration (training set Brier score = 0.146, test set = 0.206) and showed clinical utility on decision curve analysis (DCA). SHAP analysis identified PIIINP as the core predictive feature, followed by WBC, PLT, and A/G, with CIV having a relatively weaker influence.
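The reported metrics can be made concrete with a short sketch. The definitions of AUC, Brier score, and F1 are standard, but the function names and the toy predictions are illustrative assumptions, not the study's data. The abstract does not say whether "AUC decay" is an absolute or a relative difference, so both are computed. The recall value is back-calculated from the rounded precision and F1 and is not a figure reported in the abstract.

```python
def auc_score(y_true, y_prob):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def implied_recall(precision, f1):
    """Solve F1 = 2PR / (P + R) for recall R."""
    return f1 * precision / (2 * precision - f1)


# Toy, made-up predictions used only to exercise the two scoring functions.
toy_y = [1, 1, 1, 0, 0, 0]
toy_p = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
toy_auc = auc_score(toy_y, toy_p)
toy_brier = brier_score(toy_y, toy_p)

# Figures as reported in the abstract.
train_auc, test_auc = 0.867, 0.771
abs_drop = train_auc - test_auc   # difference in AUC units
rel_drop = abs_drop / train_auc   # difference as a fraction of the training AUC

# Back-calculated from rounded values, so treat it as approximate.
recall = implied_recall(precision=0.812, f1=0.771)

print(f"toy AUC={toy_auc:.3f}, toy Brier={toy_brier:.3f}")
print(f"absolute AUC drop={abs_drop:.3f}, relative drop={rel_drop:.1%}")
print(f"recall implied by precision and F1={recall:.3f}")
```

With the reported figures, the absolute drop is 0.096 and the relative drop is about 11%, so the "<10%" statement appears to correspond to the absolute reading. Lower Brier scores indicate better calibration, which is consistent with the training set (0.146) scoring better than the test set (0.206).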

The SVM-based predictive model exhibits superior discriminatory power, calibration accuracy, and clinical utility for hypersplenism in WD patients. The five key features (WBC, PLT, A/G, CIV, PIIINP), with PIIINP as the core, provide an objective quantitative basis for risk stratification, facilitating early identification of and targeted intervention for high-risk patients and improving WD prognosis.

## Linked entities

- **Diseases:** Wilson disease (MONDO:0010200), hypersplenism (MONDO:0006795)

## Full-text entities

- **Genes:** CP (ceruloplasmin) [NCBI Gene 1356] {aka AB073614, CP-2}
- **Diseases:** cirrhosis (MESH:D005355), hypersplenism (MESH:D006971), WD (MESH:D006527), autosomal recessive copper metabolism disorder (MESH:C535468)
- **Chemicals:** copper (MESH:D003300)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13038950/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13038950/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/PMC13038950/full.md

---
Source: https://tomesphere.com/paper/PMC13038950