# Value of an automated machine learning model with post-hoc explanation for predicting healthcare-seeking delays among residents in Tibetan regions

**Authors:** Zhenzhong Xi, Chenxing Meng, Qian Li, Yisha Xu, Peng Wu, Zhigang Zhang, Tingyong Han, Liangjie Zhang, Xinxuan Han

PMC · DOI: 10.3389/fpubh.2026.1682879 · Frontiers in Public Health · 2026-02-17

## TL;DR

This study uses automated machine learning and SHAP explanations to predict healthcare-seeking delays among Tibetan residents and identify key factors influencing these delays.

## Contribution

The novel contribution is the integration of AutoML with post-hoc SHAP interpretation for predicting healthcare-seeking delays in a Tibetan population.

## Key findings

- The LightGBM model achieved an AUC > 0.86, outperforming conventional approaches.
- SHAP analysis identified age and hospital quality as top predictors of healthcare-seeking delays.
- A clinical decision support system was developed to optimize resource allocation in Tibetan healthcare.

## Abstract

This study aimed to investigate key determinants of healthcare-seeking delays among Tibetan residents and develop predictive models using automated machine learning (AutoML) with post-hoc SHAP interpretation alongside a clinical decision support system.

Face-to-face surveys using structured questionnaires were administered to 1,879 Tibetan residents. Data processing employed an AutoML framework: datasets were partitioned into training (n = 1,503) and testing (n = 376) subsets at an 8:2 ratio. Standardized preprocessing—including outlier rectification, one-hot encoding (OHE), and random forest-based multiple imputation (MI)—was applied. Model validation integrated 5-fold cross-validation and SHapley Additive exPlanations (SHAP) analysis.
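The partitioning arithmetic above can be checked with a short sketch. It uses only the Python standard library and is not the authors' AutoML pipeline: the helper names, the shuffling seed, the ceiling-based rounding of the test set, and the interleaved fold assignment are all assumptions made for illustration. The abstract does not say whether the 5-fold cross-validation was run on the training subset only; the sketch assumes it was.

```python
import math
import random

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle row indices and hold out ceil(n * test_fraction) rows for testing."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)   # seed is an arbitrary illustrative choice
    n_test = math.ceil(n * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(train_indices, k=5):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    folds = [train_indices[i::k] for i in range(k)]  # interleaved, near-equal folds
    for i in range(k):
        validation = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation

# 1,879 respondents, as reported in the abstract
train_idx, test_idx = train_test_split_indices(1879)
fold_sizes = [len(validation) for _, validation in k_fold_indices(train_idx)]
print(len(train_idx), len(test_idx), fold_sizes)
```

Under these assumptions the split reproduces the reported subset sizes (1,503 training and 376 testing rows), and each cross-validation fold holds out roughly 300 training rows.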

Among 1,879 participants, the healthcare-seeking delay incidence was 41.99%. The LightGBM model significantly outperformed conventional approaches (AUC > 0.86). SHAP feature importance analysis revealed the predictor hierarchy: Age > County hospital quality score > Distance to county hospital > Township health center quality score > Able to communicate in Chinese.
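A predictor hierarchy of this kind is typically obtained by ranking features on their mean absolute Shapley value across respondents. The sketch below illustrates that idea with exact Shapley values computed by brute force over feature orderings. It is a toy under stated assumptions, not the paper's LightGBM model or the `shap` library: the scoring function `toy_risk`, its weights, the feature names, the baseline, and the two example residents are all invented for illustration.

```python
from itertools import permutations

# Hypothetical feature names, loosely echoing three of the reported predictors
FEATURES = ["age", "county_hospital_quality", "distance_km"]

def toy_risk(x):
    """Hypothetical linear risk score; the weights are invented, not fitted."""
    return (0.03 * x["age"]
            - 0.20 * x["county_hospital_quality"]
            + 0.01 * x["distance_km"])

def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in FEATURES}
    orderings = list(permutations(FEATURES))
    for order in orderings:
        current = dict(baseline)       # start with every feature at its baseline
        previous = model(current)
        for f in order:
            current[f] = x[f]          # switch feature f to the resident's value
            score = model(current)
            totals[f] += score - previous
            previous = score
    return {f: total / len(orderings) for f, total in totals.items()}

def rank_by_mean_abs(rows, model, baseline):
    """Order features by mean absolute Shapley value across all rows."""
    sums = {f: 0.0 for f in FEATURES}
    for x in rows:
        for f, value in shapley_values(model, x, baseline).items():
            sums[f] += abs(value)
    means = {f: s / len(rows) for f, s in sums.items()}
    return sorted(means, key=means.get, reverse=True)

# Invented baseline and residents, for illustration only
baseline = {"age": 40, "county_hospital_quality": 3, "distance_km": 20}
residents = [
    {"age": 70, "county_hospital_quality": 1, "distance_km": 50},
    {"age": 25, "county_hospital_quality": 4, "distance_km": 5},
]
ranking = rank_by_mean_abs(residents, toy_risk, baseline)
print(" > ".join(ranking))
```

Brute-force enumeration is only feasible for a handful of features; tree-specific algorithms are normally used for gradient-boosted models. The sketch is meant only to show what the ranking measures: for each resident, the Shapley values sum to the difference between that resident's score and the baseline score.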

A high-performance model with post-hoc SHAP interpretation draws on geographical, cultural, and healthcare resource variables to accurately identify high-risk populations. The developed clinical decision support system enables risk computation through modular interfaces, providing an evidence-based tool for optimizing hierarchical diagnosis and resource allocation in Tibetan healthcare.

## Full-text entities

- **Diseases:** Musculoskeletal (MESH:D009140), discharge (MESH:D019522), palpitations (MESH:D006331), Chronic disease (MESH:D002908), Symptom (MESH:D012816), back pain (MESH:D001416), stiffness (MESH:C566112), Gastrointestinal (MESH:D005767), dizziness (MESH:D004244), joint pain (MESH:D018771), cough (MESH:D003371), weight loss (MESH:D015431), nausea (MESH:D009325), hypoxic (MESH:D002534), diarrhea (MESH:D003967), chest pain (MESH:D002637), Neurological (MESH:D009461), vomiting (MESH:D014839), headache (MESH:D006261), respiratory diseases (MESH:D012140), dyspnea (MESH:D004417), Cardiopulmonary (MESH:D006323)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12953569/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12953569/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC12953569/full.md

---
Source: https://tomesphere.com/paper/PMC12953569