First-Day Glycemic Exposure and 28-Day Mortality in the ICU: A Multicenter Cohort Study
Joab O. Odera, Betsabe Blas, Julie Cha, Aisha Montgomery, Alice A. Ojwang, Sepiso Masenga, Elizabeth O. Odera, Nosayaba Osazuwa-Peters, Ananya Yalamanchi, David Han, Antentor O. Hinton

TL;DR
High blood sugar levels in the first day of ICU care are linked to higher 28-day mortality, and a new model helps predict risk early.
Contribution
Introduces GlucoSurvAI, an interpretable model for early ICU mortality risk prediction using first-day glycemic exposure.
Findings
Higher first-day TWAG (≥140 mg/dL) is associated with increased 28-day mortality.
GlucoSurvAI achieved high accuracy (AUROC 0.967) in predicting mortality risk.
Vasopressors and corticosteroids on day 1 are linked to higher mortality.
Abstract
Early glycemic exposure in the ICU is common and clinically modifiable, yet bedside assessment often relies on single glucose values rather than exposure-aware metrics. Interpretable, first-day prediction may support individualized glycemic targets and early intervention. To examine the association between first-day time-weighted average glucose (TWAG) and 28-day mortality, and to evaluate GlucoSurvAI, an interpretable ensemble model for first-day risk stratification. Retrospective cohort study using electornic health records from 13 U.S. hospitals. Among 18,868 adult ICU encounters, 8,048 patients from 7 U.S. hospitals met inclusion criteria (≥1 glucose value and hospital length of stay ≥24 hours). First-day glycemic exposure summarized as TWAG, categorized as <100, 100–139 (reference), 140–179, and ≥180 mg/dL. Prespecified covariates included diabetes/prediabetes, first-day insulin…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHyperglycemia and glycemic control in critically ill and hospitalized patients · Sepsis Diagnosis and Treatment · Diabetes Treatment and Management
Introduction
Hyperglycemia is frequent in critical illness and associated with infection, organ dysfunction, and death. ^1,2^ Landmark evidence (NICE-SUGAR) demonstrated that intensive glucose targets (81–108 mg/dL) increased mortality and hypoglycemia compared with a conventional target ≤ 180 mg/dL, prompting a shift away from tight control in the ICU.^2,3^ Contemporary guidance recommends protocols that minimize hypoglycemia and generally initiate treatment for persistent hyperglycemia (≥ 180 mg/dL), but questions remain about optimal early targets and how best to summarize glycemia during the first ICU day.^3–6^
Most bedside assessments rely on single or sporadic glucose measurements; however, these may misclassify exposure when sampling is intermittent or clinically driven.^7–9^ Exposure-aware metrics, such as time-weighted average glucose (TWAG), integrate all measurements and their timing and may better reflect the early glycemic burden that is most relevant to short-term outcomes.^7,9,10^ Yet the relationship between first-day TWAG and 28-day mortality, and whether this information can be operationalized into accurate, interpretable first-day risk estimation, remains unclear.^7,9,11–13^
We aimed (1) to examine the association between first-day TWAG and 28-day all-cause mortality in a multicenter ICU cohort, and (2) to evaluate an interpretable ensemble model for early risk stratification that aligns model-based explanations with clinically adjusted time-to-event estimates. We hypothesized that higher first-day TWAG, particularly above a clinically relevant upper boundary, would be associated with increased 28-day mortality and that an interpretable model could capture this exposure–mortality gradient with strong discrimination and calibration.
Methods
Data Source
Our cohort was derived from Bridge2AI CHoRUS AIM-AHEAD60 project, a multicenter resource encompassing diverse ICU populations and clinical contexts from 13 U.S. hospitals. ^14,15^
Participants
Eligible patients were aged 18–105 years, had a hospital length of stay ≥ 24 hours, and at least one glucose value recorded during the first ICU day. Among 18,868 unique ICU admissions, 8,048 from 7 of the 13 U.S. hopitals met inclusion criteria after restricting to the index ICU encounter. The participant flow diagram appears in Fig. 1.
Exposures (Glycemic Categories)
Glycemic exposure during the first 24 hours of ICU care was summarized using time-weighted average (TWA) glucose. For clinical interpretability, patients were categorized as hypoglycemic (< 70 mg/dL), euglycemic (70–180 mg/dL), or hyperglycemic (> 180 mg/dL). In analyses examining dose–response, four TWAG bins were evaluated: <100, 100–139 (reference), 140–179, and ≥ 180 mg/dL.
Outcomes
The primary outcome was all-cause mortality by 28 days; mortality by 180 days was a secondary outcome. Survival time was measured from the start of the first 24-hour ICU period. Kaplan–Meier estimates and multivariable time-to-event models were used as described below.
Covariates and Potential Confounders
Prespecified covariates were selected based on clinical relevance and prior literature and included: diabetes status (prediabetes, diabetes), acute severity indicators (e.g., shock, vasopressor score), first-day treatment exposures (insulin, corticosteroids, vasopressors), cancer, and clinical site. To address potential bias from differential glucose monitoring, we adjusted for glucose monitoring intensity and cumulative exposure (e.g., first-day TWAG and number of glucose checks). Variable definitions and coding are detailed in eMethods.
Statistical Analysis
Baseline characteristics are summarized overall and by outcome group (further discussed in eMethods in the Supplement). Survival across glycemic categories was evaluated using Kaplan–Meier curves and log-rank tests (descriptive). The primary inferential analysis used multivariable Cox proportional hazards models to estimate adjusted hazard ratios (aHRs) and 95% CIs for 28-day mortality across glucose categories (reference: 100–139 mg/dL). Models adjusted for the covariates described above, including clinical site, to mitigate confounding due to site-level practice variation; p-values are not emphasized in the narrative, consistent with our focus on aHRs and 95% CIs. Model assumptions and sensitivity analyses are described in eMethods in the Supplement.
To further contextualize glycemic exposure, we calculated restricted mean survival time (RMST) over a fixed 28-day horizon and compared RMST across the prespecified glucose bins (< 100 mg/dL, 100–139 mg/dL, 140–179 mg/dL, and ≥ 180 mg/dL) and across demographic strata (< 50, 50–64, 65–79, and ≥ 80 years; race; and ethnicity). RMST provides an interpretable, time-bounded measure of average survival and is robust when proportional hazards may not fully hold.^16–18^
Prediction Models
For context and benchmarking, we compared traditional severity-score models (SOFA and APACHE II, fit as logistic regressions for 28-day mortality) with a stacked ensemble model (GlucoSurvAI) trained on first-day features. GlucoSurvAI integrates a survival-oriented neural network with gradient-boosted trees and a final meta-learner to yield calibrated short-term mortality predictions. Development procedures (preprocessing, cross-validation, hyperparameter tuning, calibration) and interpretability methods are provided in eMethods in the Supplement.
Performance Assessment
Discrimination for binary 28-day mortality was summarized by AUROC, probability calibration by the Brier score. For time-to-event performance we report Harrell’s C-index. Threshold selection and classification metrics are described in eMethods and are not central to clinical interpretation of model comparisons in the main text.
Software
Analyses were performed using Python-based tooling; package versions and code provenance are provided in eMethods.
Reporting and Supplementary Materials
This manuscript adheres to STROBE and TRIPOD + AI recommendations; extended methods, definitions, and robustness checks appear in the eMethods/eTables/eFigures in the Supplement. JAMA Network encourages adherence to EQUATOR reporting guidance and use of the Supplement for technical detail.
Ethics
Only de-identified data were used and institutional approvals aligned with participating sites’ data-use agreements. As part of Bridge2AI, this study was approved by each institution’s Institutional Review Board (IRB), and implied consent was applied because the dataset consisted of limited, de-identified records collected from participating sites and subsequently date-shifted at a central repository.
Results
Cohort selection and first-day glycemic distribution
Among 18,868 ICU admissions, 8,048 unique patients met inclusion criteria (≥ 1 glucose value and hospital length of stay ≥ 24 hours; Fig. 1). By clinical glycemic categories, most patients were euglycemic (70–180 mg/dL) on day 1, with smaller proportions classified as hypoglycemic (< 70 mg/dL) or hyperglycemic (> 180 mg/dL) (distribution in Table 1). When considering all measurements within the initial 24-hour window, 2,056 patients (25.5%) experienced at least one hyperglycemic episode, whereas 185 (2.3%) had at least one hypoglycemic episode; 7,068 patients (87.8%) remained entirely within the target range (eTable 1 in the Supplement). Of the 8,048 included patients, 158 lacked complete timestamp data for their initial glucose measurements, which prevented calculation of day-one TWAG; these patients were retained in categorical analyses and subsequently included in multivariable models using multiple imputation by chained equations (MICE) for missing time-weighted glucose values, as detailed in the eMethods in the Supplement.
Baseline monitoring and site variation
Baseline monitoring and site differences are summarized in eTable 2 in the Supplement. Mortality differed across sites and was partly explained by case-mix and acuity: centers with higher SOFA/APACHE II burdens also had higher crude mortality. In adjusted models, we included clinical site to mitigate confounding by practice patterns and patient mix; core associations between TWAG bins and mortality were unchanged. Detailed site-level summaries and robustness checks are provided in eResults in the Supplement.
Patient mortality by clinical glycemic category
Kaplan–Meier curves showed significant differences in mortality across clinical glycemic categories (hypoglycemic < 70 mg/dL, euglycemic 70–180 mg/dL, hyperglycemic > 180 mg/dL), with higher mortality in hyperglycemia at both 28 days and 6 months (Fig. 2). In Cox models adjusted for HbA1c category and first-day insulin/glucose administration, hyperglycemia remained associated with increased mortality vs euglycemia (28-day aHR 1.91, 95% CI 1.35–2.70; 6-month aHR 1.63, 95% CI 1.18–2.25), while hypoglycemia was not significant after adjustment. Additional treatment and HbA1c associations are described in eResults in the Supplement.
GlucoSurvAI shows superior predictive performance when compared to AI/ML models
Figure 3B summarizes the performance of seven machine learning and deep learning models predicting 28-day mortality, evaluated across multiple metrics including AUROC, accuracy, sensitivity, specificity, precision, recall, F1 score, and Brier score.
GlucoSurvAI achieved the highest raw AUROC (0.967 ± 0.008) in ROC curve analysis (Fig. 3A), maintained strong accuracy (90%), balanced sensitivity (89.9%) and specificity (90.7%), and a high F1 score (94.3%), and a low Brier score (0.026). A schematic of GlucoSurvAI’s architecture is provided in eFigure 1 in the Supplement.
Additionally, site-level performance metrics showed that GlucoSurvAI AUROC values consistently exceeding 0.96 across major sites (Site 13: 0.988; Site 2: 0.980; Site 4: 0.964; Site 3: 0.963; Site 12: 0.975), coupled with high sensitivity (> 0.93) and strong negative predictive values (> 0.89) (eTable 3 in the Supplement).
TWAG exposure bins: SHAP and adjusted hazards
Model explanations from GlucoSurvAI, focused on 28-day survival and adjusted for diabetes/prediabetes, first-day insulin, glucose, corticosteroids, and vasopressors, shock, cancer, monitoring intensity, and clinical site indicated higher 28-day survival for 100–139 mg/dL, somewhat higher survival for < 100 mg/dL, and lower predicted survival for exposures ≥ 140 mg/dL (Fig. 4). These adjusted 28-day SHAP patterns aligned with adjusted 28-day GlucoSurvAI Cox models (Fig. 5, eTable 4A in the Supplement): relative to 100–139 mg/dL, mortality risk was higher for 140–179 mg/dL (aHR 1.42, 95% CI 1.25–1.62) and ≥ 180 mg/dL (aHR 1.41, 95% CI 1.17–1.69).
Beyond glycemic strata, several physiologic and clinical variables were independently associated with mortality after adjustment for confounders and for the reference glucose range (100–139 mg/dL). Among comorbidities, hypertension (aHR 0.76, 95% CI: 0.68–0.85) and COPD (aHR 0.84, 95% CI: 0.71–0.98) were associated with decreased mortality, whereas sepsis (aHR 1.28, 95% CI: 1.13–1.45) and brain injury (aHR 1.53, 95% CI: 1.22–1.93) markedly increased risk. Interestingly, psychological disorders (aHR 0.75, 95% CI: 0.67–0.84) and viral disease (aHR 0.69, 95% CI: 0.50–0.96) were associated with lower hazards, while organ failure involving heart, kidney, or liver showed modest decreased hazards (aHR 0.89, 95% CI: 0.79–0.99); which may indicate complex interactions with critical illness trajectories (eTable 4(A) and eFigure 2 in the Supplement).
When stratified by HbA1c category, risk patterns varied across glucose bins. In the ≥ 180 mg/dL group, patients with diabetes showed lower hazards compared with Normal HbA1c, whereas those with prediabetes exhibited increased risk. Diabetic patients with an end-of-day TWAG between 140–179 mg/dL had significantly decreased risk of mortality, while combinations within < 100 mg/dL and 100–139 mg/dL bins were largely neutral (eFigure 3 in the Supplement). Details on the effect of confounders may be found in eResults and eTable 4B in the Supplement.
GlucoSurvAI Stratified Analysis of Glycemic Exposure by Demographic groups reveals at risk populations
Mortality was lowest in the 100–139 mg/dL range across all strata, with restricted mean survival time (RMST) approximating 27 days. Event rates increased sharply in the 140–179 mg/dL and ≥ 180 mg/dL bins, particularly among patients aged ≥ 80 years, where mortality exceeded 16% and RMST declined to ~ 24–25 days. Younger critically ill patients (< 50 years) also exhibited elevated risk at higher glucose levels, with event rates surpassing 25% in the 140–179 mg/dL and ≥ 180 mg/dL categories.
Across racial groups, White patients consistently demonstrated lower event rates and longer RMST within the optimal glycemic range, whereas Black and Other/Unknown race categories showed higher mortality in hyperglycemic strata. Ethnicity trends were less pronounced, but Hispanic patients in older age groups (≥ 65 years) exhibited increased mortality at glucose levels ≥ 140 mg/dL compared to non-Hispanic counterparts. These patterns were mirrored by higher Cox hazard scores and GlucoSurvAI-predicted mortality probabilities (eTable 5 in the Supplement).
We expanded our analyses to include major ICU admission reasons: injury/trauma, brain injury, neurological conditions, viral and bacterial infections, surgery, organ failure, sepsis, and psychological conditions, stratified by glycemic exposure and age (eTable 6 in the Supplement). Across all strata, sepsis and organ failure consistently exhibited the highest mortality, particularly in hyperglycemic patients aged ≥ 80 years, where event rates exceeded 18–22% and RMST declined to ~ 23–25 days. Elevated risk was also observed in younger patients (< 50 years) admitted for sepsis, with event rates approaching 19–23% in the 140–179 mg/dL and ≥ 180 mg/dL bins.
Conversely, patients admitted for psychological or trauma-related indications had lower mortality in the optimal glycemic range (100–139 mg/dL), with RMST near 27 days. Mortality risk increased substantially under hyperglycemia, with event rates rising to 7–11% in older adults, and up to 33% in younger trauma patients at ≥ 140 mg/dL. Neurologic and brain injury cases showed intermediate risk patterns, with mortality climbing sharply in the highest glucose strata, especially among the oldest patients. (eTable 6).
Discussion
In this multicenter ICU cohort, we developed GlucoSurvAI, an interpretable, stacked survival model that integrates deep learning and gradient boosting to characterize first-day glycemia and predict 28-day mortality using early glycemic and physiologic data. Elevated first-day glucose exposure was associated with higher mortality, underscoring the importance of quantifying per-patient early glycemia rather than focusing solely on hypoglycemia avoidance. These findings align with evolving guidance that favors moderate glycemic targets and emphasizes minimizing both hyperglycemia and hypoglycemia in critical care.^4,12^
Most patients began ICU care within the euglycemic category by conventional clinical cut points, yet hyperglycemic excursions were common over the first 24 hours. We found that time-weighted average glucose (TWAG) offers a more comprehensive summary of cumulative exposure than single-point or short-window metrics, which can misestimate true burden when sampling is intermittent or clustered in time.^3,7,8^ By integrating every measurement and its duration, TWAG reduced misclassification evident with shorter windows, a pattern consistent with emerging reports that exposure-based metrics better reflect risk.^7,9^
Using prespecified TWAG exposure bins, both adjusted Cox models and adjusted SHAP explanations from GlucoSurvAI converged on 100–139 mg/dL as the range associated with higher 28-day survival, with higher risk beginning at 140 mg/dL. Relative to the 100–139 mg/dL reference, the 140–179 mg/dL and ≥ 180 mg/dL bins had higher adjusted hazards for 28-day mortality; <100 mg/dL trended toward higher survival but with wider uncertainty due to smaller numbers. The close alignment between adjusted SHAP (model-based explanations) and adjusted Cox (time-to-event estimates) increases confidence that this ~ 140 mg/dL inflection is a clinically meaningful signal in the initial 24 hours of an ICU admission.^7,19,20^
Interpreting glycemia in context is essential. In our adjusted models, first-day vasopressors and corticosteroids were associated with higher mortality, plausibly reflecting illness severity and treatment intensity, while insulin exposure also marked higher risk after adjustment. These associations highlight the interplay between hemodynamic support, pharmacologic therapy, and metabolic control, and reinforce that early glycemic exposure should be considered alongside acuity and therapeutic milieu when formulating risk-informed targets.^21–24^
Traditional ICU mortality prediction scores, such as SOFA and APACHE II, are well validated and have been widely adopted for decades.^25,26^ Moreover, both SOFA and APACHE scores provide a robust framework for assessing illness severity and evaluating organ function in critically ill patients.^26,27^ However, they were originally developed using older, often single-center cohorts, which may limit their ability to fully reflect contemporary patient complexity and evolving treatment practices.^25,26^ In this study, we leveraged a modern, multicenter ICU dataset to benchmark these established scores against advanced AI/ML models and novel glycemic indices derived from time-weighted glucose exposure. Our findings show that while APACHE II and SOFA maintain respectable predictive performance (AUROC = 0.749 and 0.754, respectively), they are substantially outperformed by GlucoSurvAI (AUROC = 0.976), which integrates dynamic glycemic metrics with physiological and comorbidity features. Additionally, compared with other AI/ML models, GlucoSurvAI avoids extreme trade-offs between precision and recall and offers practical advantages in speed, interpretability, and ease of deployment. These attributes make it a robust and well-rounded choice for survival prediction in this study.^28–30^
Beyond point estimates of accuracy, GlucoSurvAI preserved interpretability via SHAP, allowing clinicians to visualize how exposure bins and treatment covariates contribute to predicted risk at the bedside. Because the model relies on first-day information, it offers a practical path for early risk stratification that complements, and does not replace, established scores and clinical judgment.^30,31^
Stratified summaries suggested that the glycemia–survival gradient is context-dependent: it appeared steeper at higher exposure among older adults and in sepsis/organ-failure admissions, whereas some lower-acuity indications showed flatter gradients until exposure was markedly elevated. These patterns, together with the adjusted bin findings, support risk-informed targets rather than universal thresholds, consistent with literature showing heterogeneous glycemic risk across diagnoses and chronic glycemic status.^8,32^
In our adjusted analyses, patients with established diabetes (higher HbA1c) showed attenuated mortality risk in the upper TWAG bins, whereas those with prediabetes/normal HbA1c had higher risk at the same exposures. This pattern is consistent with prior work that indicate that acute stress hyperglycemia is more prognostic in patients without chronic hyperglycemia, while individuals with long-standing hyperglycemia may exhibit partial physiologic adaptation.^33–36^ Taken together, our results support individualized early glucose targets that account for baseline glycemic status (HbA1c) when interpreting first-day exposure.^8,37^
The present results reinforce a growing body of work linking higher early glycemic exposure and lower time-in-range with worse short-term outcomes in ICU populations.^7,9^ They also harmonize with the post–NICE-SUGAR shift away from intensive 80–110 mg/dL targets toward moderate ranges that reduce hypoglycemia risk while acknowledging the harm of sustained hyperglycemia.^4,38^ Importantly, our analysis quantifies an exposure-oriented signal during the first 24 hours and demonstrates how combining TWAG bins with an interpretable ensemble model can make this signal clinically actionable in real-time.
Strengths of our study include (1) a diverse, multisite cohort; (2) emphasis on exposure-aware metrics that reflect cumulative glycemia; (3) convergence between adjusted Cox and adjusted SHAP results; and (4) rigorous internal validation with discrimination and calibration reporting. Limitations include the retrospective design; variation in measurement frequency and modalities across sites; and the absence of standardized insulin/nutrition protocols, each of which may influence both exposure and outcomes. Although we adjusted for glucose monitoring intensity and included clinical sites to mitigate case-mix and practice differences, residual confounding remains possible. External validation and prospective evaluation, especially of model-informed glycemic management, are needed before clinical deployment (please see eAppendix).^12,39^
Collectively, our findings support maintaining early exposure in the 100–139 mg/dL range and avoiding sustained exposure ≥ 140 mg/dL during the first ICU day, while recognizing patient-specific factors, competing risks, and the real-world challenges of insulin titration. Pairing exposure-based metrics with interpretable model probabilities may help clinicians identify who is most likely to benefit from early titration and when the risk–benefit balance favors intervention, particularly in older adults and in high-acuity indications such as sepsis and organ failure.^7,9,37^
In conclusion, first-day glycemic exposure is common, measurable, and actionable. TWAG offers a robust summary of early glycemia, and both adjusted Cox and adjusted SHAP indicate 100–139 mg/dL is associated with higher 28-day survival, with higher risk beginning at 140 mg/dL. GlucoSurvAI serves as an interpretable ensemble model that leverages these exposure-aware metrics enables early, individualized risk stratification and provides a foundation for prospective, risk-informed glycemic management in critical care.^12,37^
Supplementary Material
This is a list of supplementary fi les associated with this preprint. Click to download.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Liu J, Wu J, Liu S, Li M, Hu K, Li K (2021) Predicting mortality of patients with acute kidney injury in the ICU using XG Boost model. P Lo S ONE 16(2):e 0246306. 10.1371/journal.pone.024630633539390 PMC 7861386 · doi ↗ · pubmed ↗
- 2Investigators N-SS, Finfer S, Liu B (2012) Hypoglycemia and risk of death in critically ill patients. N Engl J Med Sep 20(12):1108–1118. 10.1056/NEJ Moa 1204942 · doi ↗
- 3American Diabetes Association Professional Practice C (2024) ;47(Suppl 1):S 111–S 125. 10.2337/dc 24-S 006 · doi ↗
- 4Finfer S, Wernerman J, Preiser JC (2013) Clinical review: Consensus recommendations on measurement of blood glucose and reporting glycemic control in critically ill adults. Crit Care Jun 14(3):229. 10.1186/cc 12537 · doi ↗
- 5Krinsley JS, Maurer P, Holewinski S (2017) Glucose Control, Diabetes Status, and Mortality in Critically Ill Patients: The Continuum From Intensive Care Unit Admission to Hospital Discharge. Mayo Clin Proc. Jul ;92(7):1019–1029. 10.1016/j.mayocp.2017.04.01528645517 · doi ↗ · pubmed ↗
- 6Marik PE, Bellomo R (2013) Stress hyperglycemia: an essential survival response! Crit Care Mar 6(2):305. 10.1186/cc 12514 · doi ↗
- 7Feng M, Zhou J (2024) Relationship between time-weighted average glucose and mortality in critically ill patients: a retrospective analysis of the MIMIC-IV database. Sci Rep Feb 27(1):4721. 10.1038/s 41598-024-55504-9 · doi ↗
- 8Fong KM, Au SY, Ng GWY (2022) Glycemic control in critically ill patients with or without diabetes. BMC Anesthesiol Jul 16(1):227. 10.1186/s 12871-022-01769-4 · doi ↗
