# Integrating Feature Selection, Machine Learning, and SHAP Explainability to Predict Severe Acute Pancreatitis

**Authors:** İzzet Ustaalioğlu, Rohat Ak

PMC · DOI: 10.3390/diagnostics15192473 · 2025-09-27

## TL;DR

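This retrospective, single-center study pairs feature selection with machine-learning classifiers and SHAP explainability to predict severe acute pancreatitis (SAP) from data available at emergency department (ED) presentation; the best pipeline reached an AUROC of 0.826.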

## Contribution

A systematic comparison of 36 pipelines (six feature-selection methods × six classifiers), combined with SHAP explainability, for early prediction of severe acute pancreatitis (SAP) at emergency department (ED) presentation.

## Key findings

- The top-performing model achieved an AUROC of 0.826 using RFE–RF features and kNN.
- Random-forest-based pipelines showed favorable calibration for SAP prediction.
- SHAP analysis confirmed clinically plausible contributions from routinely available variables.
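
SHAP attributions are built on Shapley values, so a small worked example may help in reading the last finding. The sketch below computes exact Shapley values by enumerating feature coalitions for a hypothetical three-feature risk score. The function `toy_risk`, its feature names, its weights, and the reference patient are all illustrative assumptions; none are taken from the study, and this summary does not say which SHAP explainer the authors used.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy risk score; features and weights are illustrative
# assumptions, NOT coefficients reported by the study.
def toy_risk(x):
    score = 0.30 * x["pleural_effusion"] + 0.02 * (40 - x["albumin_g_l"])
    # Interaction term so the attribution is not trivially additive.
    score += 0.10 * x["pleural_effusion"] * x["low_gcs"]
    return score

def shapley_values(model, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for feat in names:
        others = [k for k in names if k != feat]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {feat}) - value(s))
        phi[feat] = total
    return phi

# Hypothetical patient and reference (baseline) values.
patient = {"pleural_effusion": 1, "albumin_g_l": 30.0, "low_gcs": 1}
reference = {"pleural_effusion": 0, "albumin_g_l": 40.0, "low_gcs": 0}
phi = shapley_values(toy_risk, patient, reference)
```

The attributions sum to the difference between the patient's score and the reference score, and the interaction term is split equally between the two features involved. Exact enumeration scales exponentially with the number of features, which is why practical SHAP tooling relies on approximations or model-specific shortcuts.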

## Abstract

Background/Objectives: Severe acute pancreatitis (SAP) carries substantial morbidity and resource burden, and early risk stratification remains challenging with conventional scores that require serial observations. The aim of this study was to develop and compare supervised machine-learning (ML) pipelines, integrating feature selection and SHAP-based explainability, for early prediction of SAP at emergency department (ED) presentation.

Methods: This retrospective, single-center cohort was conducted in a tertiary-care ED between 1 January 2022 and 1 January 2025. Adult patients with acute pancreatitis were identified from electronic records; SAP was classified per the Revised Atlanta criteria (persistent organ failure ≥ 48 h). Six feature-selection methods (univariate AUROC filter, RFE, mRMR, LASSO, elastic net, Boruta) were paired with six classifiers (kNN, elastic-net logistic regression, MARS, random forest, SVM-RBF, XGBoost) to yield 36 pipelines. Discrimination, calibration, and error metrics were estimated with bootstrapping; SHAP was used for model interpretability.

Results: Of 743 patients (non-SAP 676; SAP 67), SAP prevalence was 9.0%. Compared with non-SAP, SAP patients more often had hypertension (38.8% vs. 27.1%) and malignancy (19.4% vs. 7.2%); they presented with lower GCS, higher heart and respiratory rates, lower systolic blood pressure, and more frequent peripancreatic fluid (31.3% vs. 16.9%) and pleural effusion (43.3% vs. 17.5%). Albumin was lower by 4.18 g/L, with broader renal–electrolyte and inflammatory derangements. Across the best-performing models, AUROC spanned 0.750–0.826; the top pipeline (RFE–RF features + kNN) reached 0.826, while random-forest-based pipelines showed favorable calibration. SHAP confirmed clinically plausible contributions from routinely available variables.

Conclusions: In this study, integrating feature selection with ML produced accurate and interpretable early prediction of SAP using data available at ED arrival. The approach highlights actionable predictors and may support earlier triage and resource allocation; external validation is warranted.
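
The 6 × 6 design and the bootstrapped discrimination estimate described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the selector and classifier names come from the abstract, but the helper functions `auroc` and `bootstrap_auroc`, the percentile interval, the number of resamples, and the simulated scores are assumptions. Only the class counts (676 non-SAP, 67 SAP) are taken from the reported cohort.

```python
import random
from itertools import product

# Method and classifier names as listed in the abstract.
SELECTORS = ["univariate AUROC filter", "RFE", "mRMR", "LASSO", "elastic net", "Boruta"]
CLASSIFIERS = ["kNN", "elastic-net logistic regression", "MARS",
               "random forest", "SVM-RBF", "XGBoost"]
PIPELINES = list(product(SELECTORS, CLASSIFIERS))  # 6 x 6 pairings

def auroc(labels, scores):
    """Rank-based AUROC: P(positive scores above negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auroc(labels, scores, n_boot=100, seed=0):
    """Percentile 95% interval from resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost a class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

# Synthetic cohort matching only the reported class counts; the scores are
# simulated and do NOT reproduce the study's results.
rng = random.Random(42)
labels = [0] * 676 + [1] * 67
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
prevalence = sum(labels) / len(labels)  # 67 / 743
point = auroc(labels, scores)
low, high = bootstrap_auroc(labels, scores)
```

With 67 events among 743 patients, each bootstrap resample contains few positives, so interval width depends heavily on the minority class. That is one reason the abstract's call for external validation matters.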

## Linked entities

- **Diseases:** acute pancreatitis (MONDO:0006515), malignancy (MONDO:0004992)

## Full-text entities

- **Genes:** ALB (albumin) [NCBI Gene 213] {aka FDAHT, HSA, PRO0883, PRO0903, PRO1341}
- **Diseases:** pleural effusion (MESH:D010996), organ failure (MESH:D009102), SAP (MESH:D045169), inflammatory (MESH:D007249), malignancy (MESH:D009369), hypertension (MESH:D006973), acute pancreatitis (MESH:D010195)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12523390/full.md

---
Source: https://tomesphere.com/paper/PMC12523390