# Enhancing survival prediction for COVID-19 in diabetic patients in Mexico: integrating RMST, propensity score matching, and ensemble machine learning

**Authors:** Mariano Vargas-Santiago, Diana A. León-Velasco, Raúl Monroy, Sergio Quezada-García

PMC · DOI: 10.3389/fendo.2025.1725251 · Frontiers in Endocrinology · 2026-01-12

## TL;DR

This study combines survival analysis and machine learning to predict survival outcomes for diabetic and non-diabetic hospitalized COVID-19 patients in Mexico.

## Contribution

The study integrates RMST, propensity score matching, and ensemble machine learning to improve both survival prediction and model interpretability for diabetic COVID-19 patients.

## Key findings

- After propensity score matching, diabetic patients had a lower RMST than non-diabetic patients, a difference of 2.32 days (p = 0.0583).
- Machine learning models achieved strong internal validity (R² > 0.60) in predicting survival outcomes.
- SHAP analysis identified obesity, smoking, and hypertension as key predictors of survival.

## Abstract

This study evaluates the survival impact of diabetes on hospitalized COVID-19 patients in Mexico by combining traditional survival methods (Restricted Mean Survival Time, RMST) with machine learning (ML) prediction. The goal is to understand how diabetes and associated comorbidities affect short-term survival and to develop accurate, interpretable models that support data-driven decision-making.

A national dataset of over one million COVID-19 cases was analyzed. Diabetic and non-diabetic cohorts were matched using propensity scores based on key covariates (e.g., age, gender, and comorbidities). RMST differences were estimated from survival curves and assessed with statistical testing. Separately, two machine learning models, a Random Forest (RF) and a Variational Deep Neural Network (VDNN), were trained to predict individual RMST values, and SHapley Additive exPlanations (SHAP) were used for model interpretability.
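The matching step can be illustrated with a minimal sketch. The abstract does not state which matching algorithm was used, so the greedy 1:1 nearest-neighbor rule, the caliper, the function name `match_nearest`, and the toy scores below are all assumptions made for illustration. The propensity scores are assumed to be already estimated (for example, by a logistic regression of diabetes status on the covariates).

```python
from typing import Dict, List, Tuple


def match_nearest(
    treated: Dict[str, float],
    control: Dict[str, float],
    caliper: float,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on propensity score, without replacement.

    `treated` and `control` map a patient id to a precomputed propensity score.
    A pair is kept only if the score gap is at most `caliper`.
    This is an illustrative rule, not necessarily the one used in the paper.
    """
    available = dict(control)  # controls not yet matched
    pairs: List[Tuple[str, str]] = []
    # Process treated patients in ascending score order for determinism.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score distance.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Toy scores (hypothetical, not taken from the study).
diabetic = {"d1": 0.30, "d2": 0.80}
non_diabetic = {"n1": 0.32, "n2": 0.55, "n3": 0.79}
matched = match_nearest(diabetic, non_diabetic, caliper=0.05)
```

Greedy matching is simple but order-dependent; optimal matching or matching with replacement are common alternatives, and covariate balance is usually checked after any of them.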

The RMST for diabetic patients was lower than that for non-diabetic patients, with a difference of 2.32 days (p = 0.0583) after matching. Predictive models achieved strong internal validity (R² > 0.60). SHAP analysis revealed obesity, smoking, and hypertension as the top predictors and suggested that temporal variables and comorbidities played a central role in short-term survival.
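For readers unfamiliar with the metric: RMST up to a horizon tau is the area under the survival curve S(t) from 0 to tau, so an RMST difference is the gap between two such areas. The sketch below computes it from a Kaplan-Meier estimate in plain Python. It assumes a Kaplan-Meier estimator and uses hypothetical function names, a hypothetical horizon, and toy data; the abstract does not specify the estimator or the truncation time used in the study.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (t, S(t)) at each distinct time with at least one event.

    `events[i]` is 1 if the outcome was observed at `times[i]`, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def rmst(times: Sequence[float], events: Sequence[int], tau: float) -> float:
    """Area under the Kaplan-Meier step function from 0 to tau.

    The curve is held flat after the last event time, a common convention.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle under the previous step
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)  # final rectangle up to the horizon
    return area


# Toy follow-up data in days (hypothetical, not taken from the study).
group_a = rmst([5, 10, 15, 20], [1, 1, 0, 1], tau=20)
group_b = rmst([8, 12, 20, 20], [1, 0, 0, 0], tau=20)
rmst_difference = group_b - group_a
```

Unlike a hazard ratio, the resulting difference is expressed in time units (days here), which is what makes a figure such as the 2.32-day gap directly interpretable.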

Combining survival analysis with machine learning provides both inferential and predictive insights into the mortality risk of diabetic COVID-19 patients. More importantly, the results show that pairing traditional survival analysis with modern machine learning yields accurate and interpretable predictions. These predictions can support personalized interventions for patients with COVID-19 and comorbid diabetes, such as prioritizing early clinical monitoring, individualized treatment plans, or risk-informed hospital admission decisions, and can guide a more efficient allocation of healthcare resources.

## Linked entities

- **Diseases:** diabetes (MONDO:0005015), COVID-19 (MONDO:0100096), obesity (MONDO:0011122)

## Full-text entities

- **Diseases:** Diabetic (MESH:D003920), COVID-19 (MESH:D000086382), obesity (MESH:D009765), hypertension (MESH:D006973)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12832334/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12832334/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC12832334/full.md

---
Source: https://tomesphere.com/paper/PMC12832334