# Development and validation of a bedside-available machine learning model to predict discrepancies between SaO₂ and SpO₂: Exploring factors related to the discrepancies

**Authors:** Raito Sato, Naoki Ito, Sakina Kadomatsu, Norikazu Hanioka, Mikio Nakajima, Tadahiro Goto, Mohsen Mehrabi

PMC · DOI: 10.1371/journal.pone.0334350 · PLOS One · 2025-10-21

## TL;DR

This study developed a machine learning model to predict when pulse oximeter readings may be inaccurate in critically ill patients, helping identify hidden hypoxemia.

## Contribution

A bedside-available machine learning model was developed and validated to predict discrepancies between SpO₂ and SaO₂ using non-invasive data.

## Key findings

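- The XGBoost model achieved an AUROC of 0.73 in the development dataset (eICU). In external validation on MIMIC-IV the AUROC was 0.56, rising to 0.70 after an exploratory model-updating step followed by temporal validation.
- Worse vital signs, such as low blood pressure and low temperature, were key factors associated with the discrepancy.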
- The model was deployed as a web-based application for clinical accessibility.
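
The AUROC values listed above can be read as the probability that a randomly chosen measurement pair with the discrepancy receives a higher predicted risk than a randomly chosen pair without it. The following is a minimal sketch of that definition in plain Python, not the authors' evaluation code; the function name `auroc` and the toy labels and scores are illustrative assumptions.

```python
def auroc(labels, scores):
    """Rank-based AUROC: P(score of a positive > score of a negative).

    Ties between a positive and a negative count as half a win.
    `labels` holds 0/1 outcomes; `scores` holds predicted risks.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = discrepancy present, 0 = absent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = auroc(toy_labels, toy_scores)
```

On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why the externally validated value of 0.56 indicates only weak discrimination before model updating.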

## Abstract

In critically ill patients, a discrepancy frequently exists between percutaneous oxygen saturation (SpO₂) and arterial blood oxygen saturation (SaO₂), which can lead to hypoxemia being overlooked. This study aimed to explore the factors related to the discrepancy and to develop an easy-to-use prediction model that uses readily available bedside information to predict the discrepancy and suggest the need for arterial blood gas measurement. This prognostic study used the eICU Collaborative Research Database from 2014 to 2015 for model development and MIMIC-IV data from 2008 to 2019 for model validation. To predict the outcome of SpO₂ exceeding SaO₂ by 3% or more, non-invasive, readily available bedside information (patient demographics, vital signs, vasopressor use, ventilator use) was used to develop prediction models with three machine learning methods (decision tree, logistic regression, XGBoost). To make the model accessible, it was deployed as a web-based application. Additionally, the contribution of each variable was explored using partial dependence plots and SHAP values. From 4,781 admission records in the eICU data, a total of 19,804 paired SpO₂ and SaO₂ measurements were used. Among the three machine learning models, the XGBoost model demonstrated the best predictive performance, with an AUROC of 0.73 and a calibration slope of 0.90. In the MIMIC-IV validation cohort, the AUROC was 0.56. An exploratory model-updating step followed by temporal validation raised the AUROC to 0.70, with a calibration slope of 0.85. In both datasets, worse vital signs (e.g., low blood pressure, low temperature) were associated with the discrepancy between SpO₂ and SaO₂. Using non-invasive bedside data, a machine learning model was developed to predict SpO₂–SaO₂ discrepancy, and vital signs were identified as key contributors. These findings underscore the need for awareness of hidden hypoxemia and provide a basis for further study to accurately evaluate the actual SaO₂.
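
To make the outcome definition and the calibration slope in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline. It assumes that "3% or more" means a difference of at least 3 percentage points, and it estimates the calibration slope in the usual way, as the coefficient obtained when the observed outcome is regressed on the logit of the predicted probability. The function names, the Newton-Raphson fitting routine, and the toy data are all illustrative assumptions.

```python
import math


def label_discrepancy(spo2, sao2, threshold=3.0):
    """Return 1 if SpO2 exceeds SaO2 by `threshold` points or more, else 0.

    Assumption: the threshold is in percentage points of saturation.
    """
    return 1 if (spo2 - sao2) >= threshold else 0


def calibration_intercept_slope(labels, probs, iters=25):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson; return (a, b).

    A slope b near 1 suggests well-scaled predictions; b < 1 suggests
    predictions that are too extreme. Probabilities must lie strictly
    between 0 and 1, and the data must not be perfectly separable.
    """
    x = [math.log(p / (1.0 - p)) for p in probs]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, labels):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu          # gradient w.r.t. intercept
            g1 += (yi - mu) * xi   # gradient w.r.t. slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        # Newton step: solve the 2x2 system H * delta = g.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Toy data (hypothetical): predictions of 0.1 and 0.9, while the observed
# event rates in those groups are 1/5 and 4/5, i.e. overconfident predictions.
toy_probs = [0.1] * 5 + [0.9] * 5
toy_labels = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
toy_intercept, toy_slope = calibration_intercept_slope(toy_labels, toy_probs)
```

In this toy case the fitted slope falls below 1, the same direction as the reported slopes of 0.90 and 0.85, which indicates predicted risks somewhat more extreme than the observed ones.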

## Full-text entities

- **Diseases:** critically ill (MESH:D016638), hypoxemia (MESH:D000860)
- **Chemicals:** oxygen (MESH:D010100)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12539712/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12539712/full.md

## References

17 references — full list in the complete paper: https://tomesphere.com/paper/PMC12539712/full.md

---
Source: https://tomesphere.com/paper/PMC12539712