# Hypoxemia prediction model based on XGBoost during sedation for gastrointestinal endoscopy

**Authors:** Rong Zhao, Zheng Chen, Qingyu Teng, Tao Xu, Qi Li, Helin Gong, Hongjun Ji, Hui Zhang

PMC · DOI: 10.3389/fmed.2025.1714512 · Frontiers in Medicine · 2026-01-12

## TL;DR

A machine learning model using XGBoost was developed to predict hypoxemia during sedated gastrointestinal endoscopy, improving patient safety and clinical decision-making.

## Contribution

A novel XGBoost-based hypoxemia prediction model with interpretable features and improved accuracy for sedated gastrointestinal endoscopy.

## Key findings

- The XGBoost model achieved an accuracy, recall, and F1-score of 0.91 and an ROC–AUC of 0.74.
- BMI, waist circumference, neck circumference, age, and baseline SpO2 were identified as the most significant predictors.
- Model performance improved when the model was applied to a more balanced dataset of 647 samples, underscoring the importance of sample size.

## Abstract

Hypoxemia is the most common complication of sedated gastrointestinal endoscopy and can lead to serious consequences. Predicting and preventing hypoxemia remains challenging. Accurate prediction using integrated clinical data and artificial intelligence shows great potential. This study aimed to develop a robust, interpretable, and generalizable Machine Learning (ML) model with acceptable performance for predicting hypoxemia during sedated gastrointestinal endoscopy.

This prospective study included 647 adult patients who underwent sedated gastrointestinal endoscopy at Shanghai Sixth People's Hospital, affiliated with Shanghai Jiao Tong University School of Medicine, between January and May 2025. For feature selection, we combined statistical and ML techniques: Pearson correlation analysis, t-test, chi-square test, Levene's test, SHapley Additive exPlanations (SHAP) values, and eXtreme Gradient Boosting (XGBoost) feature importance metrics. Prediction models were developed using XGBoost, and their performance was evaluated using accuracy, precision, recall, F1-score, and Receiver Operating Characteristic Area Under the Curve (ROC–AUC). After identifying the optimal model, a hypoxemia prediction model was established and validated. We also evaluated combinations of existing features to construct new, innovative features.
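As an illustration of the correlation-based screening step named above, the following is a minimal sketch that ranks candidate features by the absolute Pearson correlation with a binary hypoxemia label (for a 0/1 label this is the point-biserial correlation). The function names, feature names, and toy values are assumptions made for this example; they are not taken from the paper, and the study's actual selection pipeline also used hypothesis tests, SHAP values, and XGBoost importances that are not shown here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no linear signal; report 0 instead of dividing by zero.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, labels):
    """Return (name, r) pairs sorted by |r|, strongest linear association first."""
    scores = {name: pearson_r(values, labels) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data (NOT from the study): 1 = hypoxemia occurred, 0 = it did not.
toy_features = {
    "bmi": [22.0, 31.5, 24.1, 35.2, 27.8, 21.3],
    "baseline_spo2": [99, 95, 98, 94, 97, 99],
    "constant": [1, 1, 1, 1, 1, 1],
}
toy_labels = [0, 1, 0, 1, 0, 0]

ranking = rank_features(toy_features, toy_labels)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

A univariate screen like this only captures linear association with the outcome, which is presumably one reason to pair it with model-based measures such as SHAP values.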

The XGBoost model demonstrated the best performance, achieving an accuracy, recall, and F1-score of 0.91 and an ROC–AUC of 0.74 using the selected features. Feature importance analysis identified 29 key features, comprising 26 traditional features and three innovative features introduced in this study; Body Mass Index (BMI), waist circumference, neck circumference, age, and baseline SpO2 contributed most. Model performance improved when the model was applied to a more balanced dataset of 647 samples, underscoring the importance of sample size in model accuracy.
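For readers who want to see how the metrics reported above are defined, here is a minimal sketch that computes accuracy, precision, recall, F1-score, and ROC-AUC from labels and predicted scores. The data are hypothetical, the 0.5 threshold is an assumption, and the metrics are computed for the positive class only; this summary does not state which averaging convention the paper used, so the numbers are not comparable to the reported 0.91 and 0.74.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced toy data (NOT from the study): 2 positives out of 10.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.1, 0.3, 0.2, 0.6, 0.1, 0.4, 0.7, 0.35]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, y_score)
print(metrics)
print("roc_auc:", auc)
```

Accuracy and the threshold-based metrics depend on the chosen cut-off, whereas ROC-AUC summarizes ranking quality across all thresholds, so the two kinds of figures can diverge on imbalanced data.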

We present a robust XGBoost-based hypoxemia prediction model that can help clinicians identify at-risk patients during sedated gastrointestinal endoscopy. The model's performance highlights the potential of artificial intelligence to enhance patient safety and clinical decision-making. Future studies should focus on refining the model using larger and more diverse datasets to improve predictive accuracy and clinical applicability. Additionally, methods such as latent-space analysis will be explored to address class imbalance.

## Full-text entities

- **Diseases:** Hypoxemia (MESH:D000860)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12833028/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12833028/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/PMC12833028/full.md

---
Source: https://tomesphere.com/paper/PMC12833028