# Hypoxemia prediction in pediatric patients under general anesthesia using machine learning: A retrospective observational study and external validation

**Authors:** Sujin Baek, Jung-Bin Park, Jihye Heo, Kyungsang Kim, Donghyeon Baek, Chahyun Oh, Hyung-Chul Lee, Dongheon Lee, Boohwi Hong, Vijayalakshmi Kakulapati

PMC · DOI: 10.1371/journal.pone.0339276 · PLOS One · 2026-01-08

## TL;DR

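This study uses machine learning to predict hypoxemia in children under general anesthesia. XGBoost performed best in internal validation (AUROC 0.85) and the Transformer performed best in external validation at a second hospital (AUROC 0.83).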

## Contribution

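The study develops four machine learning models (XGBoost, LSTM, InceptionTime, and Transformer) for hypoxemia prediction in pediatric anesthesia and externally validates them on cases from a second hospital.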

## Key findings

- XGBoost and Transformer models showed strong hypoxemia prediction performance in pediatric patients.
- Transformer achieved the best external validation performance with an AUROC of 0.83.
- Shortening the observation window from 1 minute to 10 seconds reduced AUPRC but maintained high AUROC.

## Abstract

Pediatric patients under general anesthesia are particularly vulnerable to hypoxemia, which can lead to rapid oxygen desaturation. This vulnerability necessitates heightened vigilance from anesthesiologists, making pediatric anesthesia management especially challenging. Continuous intraoperative monitoring of oxygenation is critical. However, traditional methods relying solely on SpO2 readings may be insufficient and prone to inaccuracies.

This study aimed to develop and externally validate various machine learning models to predict hypoxemia in pediatric patients under general anesthesia. This retrospective observational study included 800 pediatric cases from Seoul National University Hospital and 134 pediatric cases from Chungnam National University Hospital. Patient data, including vital signs and ventilator parameters sampled every 2 seconds, were analyzed. Four machine learning models (XGBoost, LSTM, InceptionTime, and Transformer) were evaluated using area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), and F1-score.
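The three evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. This is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold for the F1-score are all assumptions made for illustration, and the paper most likely relied on a standard library implementation.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case scores
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.
    Simplification: assumes no tied scores."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / n_pos


def f1_at_threshold(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> float:
    """F1-score after binarizing scores at `threshold` (assumed value)."""
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data, purely illustrative (not taken from the study).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))
print(average_precision(labels, scores))
print(f1_at_threshold(labels, scores))
```

AUROC depends only on how positives rank against negatives, whereas AUPRC and F1 are sensitive to how rare the positive class is, which is one reason the two can move in different directions on imbalanced data.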

XGBoost achieved the highest performance in internal validation (AUROC, 0.85), whereas the Transformer model demonstrated the best performance in external validation (AUROC, 0.83). Reducing the observation window from 1 minute to 10 seconds lowered the AUPRC but preserved high AUROC.
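To make the observation-window comparison concrete: at the stated sampling interval of 2 seconds, a 1-minute window holds 30 samples per signal and a 10-second window holds 5. The sketch below slices a signal into overlapping observation windows under that assumption. The function names, the stride of one sample, and the synthetic signal are hypothetical, and the summary does not describe how the authors actually segmented or labeled their data.

```python
from typing import List, Sequence

SAMPLING_INTERVAL_S = 2  # from the abstract: one sample every 2 seconds


def samples_per_window(window_s: int, interval_s: int = SAMPLING_INTERVAL_S) -> int:
    """Number of samples that fit in an observation window of `window_s` seconds."""
    if window_s <= 0 or window_s % interval_s != 0:
        raise ValueError("window must be a positive multiple of the sampling interval")
    return window_s // interval_s


def sliding_windows(
    signal: Sequence[float], window_s: int, stride: int = 1
) -> List[List[float]]:
    """Split `signal` into overlapping windows of `window_s` seconds.
    `stride` is the step between window starts, in samples (assumed value)."""
    n = samples_per_window(window_s)
    return [list(signal[i:i + n]) for i in range(0, len(signal) - n + 1, stride)]


# Synthetic 80-second trace (40 samples), purely illustrative.
trace = [float(i) for i in range(40)]
print(samples_per_window(60), len(sliding_windows(trace, 60)))
print(samples_per_window(10), len(sliding_windows(trace, 10)))
```

A shorter window gives the model less history per prediction but yields more windows from the same recording, which is one plausible way the trade-off reported above could arise.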

The XGBoost and Transformer models demonstrated robust performance in predicting intraoperative hypoxemia in pediatric patients under general anesthesia across two hospitals. Adjustments for age-related variations did not enhance model performance. Future research should focus on developing machine learning models that can accurately distinguish true hypoxemia, leading to clinically significant improvements in patient outcomes.

## Full-text entities

- **Diseases:** Hypoxemia (MESH:D000860)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12782441/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12782441/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/PMC12782441/full.md

---
Source: https://tomesphere.com/paper/PMC12782441