# Unveiling the Gaps: Machine Learning Models for Unmeasured Ions

**Authors:** Furkan Tontu, Zafer Çukurova

PMC · DOI: 10.3390/diagnostics16030427 · Diagnostics · 2026-02-01

## TL;DR

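This retrospective ICU cohort study compares three parameters for estimating unmeasured ions in critically ill patients: the albumin-corrected anion gap (AGc), the strong ion gap (SIG), and the base excess gap (BEGap). BEGap explained arterial pH best across linear and machine learning models.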

## Contribution

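The study identifies BEGap as a stronger bedside parameter for estimating unmeasured ions than the albumin-corrected anion gap (AGc) or the strong ion gap (SIG), and assesses temporal generalizability in an independent external validation cohort.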

## Key findings

- BEGap outperformed AGc and SIG as a determinant of arterial pH across linear and machine learning models.
- Among the machine learning models, XGBoost showed the most stable and accurate explanatory performance across the three approaches.
- BEGap is a practical and physiologically relevant bedside parameter for ICU use.

## Abstract

Background: Unmeasured ions (UIs) contribute significantly to acid–base disturbances in critically ill patients, yet the optimal parameter for their estimation remains uncertain. The most widely used indicators are the albumin-corrected anion gap (AGc), the strong ion gap (SIG), and the base excess gap (BEGap).

Methods: In this retrospective cohort study, 2274 ICU patients (2018–2022) were included in the development cohort, and an independent external validation cohort of 1202 patients (2023–2025) was used to assess temporal generalizability. Three approaches to blood gas analysis were evaluated: traditional (PaCO2, HCO3−, AGc), Stewart (PaCO2, SIDa, ATOT, SIG), and partitioned base excess (PaCO2, BECl, BEAlb, BELac, BEGap). Multivariable linear regression (MLR) and machine learning (ML) models, namely random forest (RF), extreme gradient boosting (XGBoost), and support vector regression (SVR), were applied to evaluate the explanatory performance of each approach with respect to arterial pH. Model performance was assessed using adjusted R², RMSE, and MAE. Variable importance was quantified with tree-based methods, SHAP values, and permutation importance. All modeling and reporting steps followed the PROBAST-AI guideline.

Results: In MLR, the partitioned base excess (BE) approach achieved the highest explanatory performance (adjusted R² = 0.949), followed by the traditional (0.929) and Stewart (0.926) approaches. In ML analyses, model fit was high across all approaches. For the traditional approach, R² values were 0.979 with RF, 0.974 with XGBoost, and 0.934 with SVR. The Stewart approach showed lower overall explanatory performance, with R² values of 0.876 (RF), 0.967 (XGBoost), and 0.996 (SVR). The partitioned BE approach again demonstrated the strongest explanatory performance, achieving R² values of 0.975 with XGBoost and 0.989 with SVR. Across all analytical models, BEGap consistently emerged as a strong and independent determinant of arterial pH, outperforming SIG and AGc. SIG showed an intermediate contribution, whereas AGc provided minimal independent explanatory value. Among ML models, XGBoost showed the most stable and accurate explanatory performance across approaches.

Conclusions: This study demonstrates that BEGap is a practical, physiologically informative, and bedside-applicable parameter for assessing unmeasured ions, outperforming both AGc and SIG across linear and non-linear analytical models.
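The abstract reports model performance as adjusted R², RMSE, and MAE. As a reading aid, the sketch below shows how these three metrics are conventionally computed from observed and model-estimated pH values. It is a minimal illustration under stated assumptions, not the authors' code: the function name `regression_metrics`, the `n_predictors` argument, and the example pH values are all hypothetical and do not come from the paper.

```python
import math


def regression_metrics(y_true, y_pred, n_predictors):
    """Return R2, adjusted R2, RMSE, and MAE for paired observations.

    Hypothetical helper for illustration; `n_predictors` is the number of
    explanatory variables in the fitted model (e.g. 3 for a model using
    PaCO2, HCO3-, and AGc).
    """
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")

    mean_y = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true has zero variance; R2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R2 penalises R2 for the number of predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"r2": r2, "adj_r2": adj_r2, "rmse": rmse, "mae": mae}


# Made-up arterial pH values, purely for demonstration.
observed_ph = [7.10, 7.25, 7.32, 7.40, 7.45, 7.50]
estimated_ph = [7.12, 7.24, 7.30, 7.41, 7.46, 7.48]

metrics = regression_metrics(observed_ph, estimated_ph, n_predictors=3)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because adjusted R² penalises additional predictors, it is the fairer basis for comparing approaches that use different numbers of variables, such as the four-variable Stewart set and the five-variable partitioned BE set listed in the Methods.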

## Full-text entities

- **Genes:** ALB (albumin) [NCBI Gene 213] {aka FDAHT, HSA, PRO0883, PRO0903, PRO1341}
- **Diseases:** critically ill (MESH:D016638), acid-base disturbances (MESH:D000137)
- **Chemicals:** HCO3 (MESH:D001639)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12897467/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12897467/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12897467/full.md

---
Source: https://tomesphere.com/paper/PMC12897467