# Development and validation of a practical prediction model for post-ERCP pancreatitis using machine learning

**Authors:** Tianyu De, Guohui Du, Hongkun Yin, Hao Wang, Wei Wang, Tian Ma, Junbai Ma, Hao Wang, Qi Wang

PMC · DOI: 10.3389/fsurg.2025.1628956 · Frontiers in Surgery · 2025-11-03

## TL;DR

This paper creates a machine learning model to predict the risk of post-ERCP pancreatitis, helping doctors identify high-risk patients and improve prevention strategies.

## Contribution

A novel machine learning-based prediction model and simplified scoring system for post-ERCP pancreatitis risk using clinical features and LightGBM.

## Key findings

- XGBoost, SVM, LightGBM, and MLP models outperformed logistic regression in predicting post-ERCP pancreatitis risk.
- A simplified scoring system based on the LightGBM model achieved an AUC of 0.75.
- Clinical features like pancreatic stent placement and age were identified as significant predictors.

## Abstract

Post-endoscopic retrograde cholangiopancreatography (ERCP) pancreatitis (PEP) is one of the most frequent and severe complications of ERCP. Given recent advances in both endoscopy and artificial intelligence research, it is now feasible to construct a practical risk prediction model that helps identify patients at elevated risk of PEP.

We developed and validated a concise predictive model for post-ERCP pancreatitis risk with logistic regression (LR), LightGBM, Support Vector Machine (SVM), XGBoost, and Multilayer Perceptron (MLP) neural network models.

We included 688 patients who had undergone ERCP to form the dataset, with 70% used for training and 30% for validation. Backward stepwise selection based on logistic regression was then used to select pertinent clinical features, which were fed into the machine learning (ML) models to construct the final predictive model. Model performance was evaluated with several metrics. The selected clinical features were then incorporated into a simplified, points-based risk scoring system for potential bedside application and further evaluation.
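The 70%/30% partition described above can be sketched in a few lines. This is a minimal illustration only: the summary does not say whether the authors stratified by outcome, so the stratification, the function name `stratified_split`, the seed, and the synthetic label counts are all assumptions.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, val_idx), keeping the class ratio similar in both parts.

    Assumption: stratifying by outcome; the summary only states a 70/30 split.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    train_idx, val_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)                       # shuffle within each class
        cut = round(len(idx) * train_frac)     # size of the training share
        train_idx.extend(idx[:cut])
        val_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(val_idx)


# Synthetic labels only: 688 matches the cohort size, but the number of
# PEP cases (60) is invented for illustration and is not from the paper.
labels = [1] * 60 + [0] * 628
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))
```

Splitting within each outcome class keeps the rare positive cases represented in both partitions, which matters when the event rate is low.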

Based on the collected data and the results of backward stepwise regression, we identified the following features as potentially significant clinical variables influencing the risk of post-ERCP pancreatitis: periampullary diverticulum, pancreatic stent placement, pancreatic guidewire passages, dilation of the extrahepatic bile duct, age, and coronary artery disease. A prediction model was constructed from these features, and several ML models were then built to assess its performance. All ML models outperformed the conventional logistic regression (LR) model in terms of area under the ROC curve (AUC), with XGBoost, SVM, LightGBM, and MLP all achieving at least acceptable performance. Finally, we developed a simplified scoring system based on the LightGBM model, which achieved an AUC of 0.75.
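For readers less familiar with the metric, the AUC used to compare the models can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below implements that pairwise definition in plain Python; the function name `auc` and the toy scores are illustrative assumptions, not the authors' code or data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Toy predictions, invented for illustration only.
toy_scores = [0.2, 0.3, 0.6, 0.7, 0.9]
toy_labels = [0, 1, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
print(round(toy_auc, 3))
```

The quadratic loop is fine for a cohort of a few hundred patients; a rank-based formulation would be preferable for much larger datasets.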

We developed and validated a concise predictive model for post-ERCP pancreatitis risk, and a simplified scoring system based on the LightGBM model. This model facilitates individual risk prediction and preventive strategy selection.
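A points-based score of the kind described here typically sums integer points over the predictors that are present. The following is a sketch under stated assumptions: the six feature names follow the abstract, but the point values, the way age is turned into an indicator, and the names `PLACEHOLDER_POINTS` and `risk_score` are hypothetical and do not reproduce the published scoring system.

```python
FEATURES = (
    "periampullary_diverticulum",
    "pancreatic_stent_placement",
    "pancreatic_guidewire_passage",
    "extrahepatic_bile_duct_dilation",
    "age_indicator",            # assumption: age pre-binarized by the caller
    "coronary_artery_disease",
)

# Placeholder weights: every feature gets 1 point. These are NOT the
# paper's values; substitute the published points before any real use.
PLACEHOLDER_POINTS = {name: 1 for name in FEATURES}


def risk_score(patient, points=PLACEHOLDER_POINTS):
    """Sum the points of all features flagged as present for one patient."""
    missing = [name for name in points if name not in patient]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return sum(points[name] for name in points if patient[name])


# Hypothetical patient record, invented for illustration.
example_patient = {name: False for name in FEATURES}
example_patient["periampullary_diverticulum"] = True
example_patient["coronary_artery_disease"] = True
example_total = risk_score(example_patient)
print(example_total)
```

Passing the weights as a parameter keeps the scoring logic separate from the specific point values, so the published table can be dropped in without changing the function.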

## Full-text entities

- **Diseases:** dilation of the extrahepatic bile duct (MESH:D001651), pancreatitis (MESH:D010195), periampullary diverticulum (MESH:D004240), ERCP (MESH:D012183), coronary artery disease (MESH:D003324)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12620498/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12620498/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12620498/full.md

---
Source: https://tomesphere.com/paper/PMC12620498