# A predictive study of glycaemic reversal in Chinese individuals with prediabetes based on machine learning: a 5-year cohort study

**Authors:** Changshun Yan, Su Hu, Hangyu Cao, Rui Xu, Guiqiu Cao, Genshan Ma

PMC · DOI: 10.3389/fendo.2026.1686082 · Frontiers in Endocrinology · 2026-01-28

## TL;DR

This study uses machine learning to predict which Chinese adults with prediabetes revert to normoglycemia over 5 years of follow-up, and identifies age, fasting plasma glucose, BMI, blood pressure, and triglycerides as key factors.

## Contribution

The study develops and validates an SVM-based model that predicts glycemic reversal in Chinese individuals with prediabetes.

## Key findings

- An SVM model outperformed other algorithms with a t-AUC of 0.711 in predicting glycemic reversal.
- Age, fasting plasma glucose (FPG), BMI, systolic blood pressure (SBP), diastolic blood pressure (DBP), and triglycerides were identified as key factors influencing normoglycemia reversal.
- A reversal rate of 52.6% was observed in the prediabetic cohort over 5 years.

## Abstract

Diabetes mellitus (DM) poses a major global public health challenge. Prediabetes, a critical stage in the progression of DM, represents a pivotal window for intervention and prevention. This study aims to develop and validate a machine learning-based prediction model for glycemic reversal in Chinese individuals with prediabetes, with the goal of facilitating such reversal in this population.

This study analyzed data on Chinese adults from the Dryad database, with a follow-up period from 2010 to 2016. LASSO regression was used to select variables. The selected variables were then used to construct models using random forest, gradient boosting decision tree, eXtreme gradient boosting, Naive Bayes, adaptive boosting, support vector machine (SVM), and a Cox model. Each model's discriminative ability was assessed by the area under the curve (AUC), and predictive performance was evaluated by computing time-dependent AUC (t-AUC), accuracy, precision, recall, F1, and C-index. Shapley additive explanations (SHAP) analysis was applied to interpret the key variables identified by the optimal model, and Kaplan-Meier curves for key variables associated with glycemic improvement were plotted to explore differences between groups.
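The evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy inputs, and the choice of Harrell's formulation for the C-index are assumptions made for illustration, since this summary does not say which variant the paper used. Here the "event" is reversion to normoglycemia, and a higher score is taken to mean earlier predicted reversion.

```python
from itertools import combinations


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = reverted)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def harrell_c_index(times, events, scores):
    """Harrell's C (assumed variant): among comparable pairs, the share in
    which the subject with the earlier observed event has the higher score."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the
        # earlier subject actually experienced the event (not censored).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if scores[a] > scores[b]:
            concordant += 1.0
        elif scores[a] == scores[b]:
            concordant += 0.5  # tied scores count as half-concordant
    return concordant / comparable if comparable else float("nan")


# Hypothetical toy data, not taken from the study.
toy_metrics = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
toy_c = harrell_c_index(
    times=[1, 2, 3, 4, 5],
    events=[1, 1, 0, 1, 0],
    scores=[0.9, 0.4, 0.6, 0.5, 0.1],
)
print(toy_metrics)
print(f"C-index: {toy_c:.2f}")
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering. Pairs in which the shorter follow-up is censored are skipped because their true ordering is unknown.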

A total of 1,792 adults with prediabetes were enrolled. During 5 years of follow-up, 942 achieved normoglycemia, yielding a reversal rate of 52.6%. After differential analysis and LASSO regression screening, 12 feature variables were selected for model construction. The 3-year, 4-year, and 5-year AUC values for the Cox model all exceeded 0.61. Six machine learning algorithms were employed to construct predictive models. The SVM demonstrated the best overall performance, with a t-AUC of 0.711, accuracy of 0.652, precision of 0.620, recall of 0.661, F1 of 0.639, and a C-index of 0.709. SHAP analysis revealed that age, FPG, BMI, SBP, DBP, and triglycerides are key factors influencing normoglycemia reversal in individuals with prediabetes.
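The headline figures in this paragraph can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers reported here, and the helper names are hypothetical. The F1 check starts from the rounded precision and recall, so it can only be expected to agree with the reported value to within rounding.

```python
def reversal_rate(n_reverted, n_enrolled):
    """Share of enrolled participants who reached normoglycemia."""
    return n_reverted / n_enrolled


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


rate = reversal_rate(942, 1792)
f1 = f1_from(0.620, 0.661)

print(f"reversal rate: {rate:.1%}")  # 52.6%
print(f"F1 from rounded precision and recall: {f1:.3f}")  # 0.640
```

The recomputed F1 of about 0.640 differs from the reported 0.639 only in the third decimal place, which is consistent with precision and recall having been rounded before publication.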

We developed an SVM model to predict glycemic reversal in the prediabetic population in China, and identified key factors influencing glycemic improvement. This work provides a scientific basis for both this population and clinicians to implement early targeted interventions, thereby aiding in reducing the incidence of DM and alleviating the healthcare burden.

## Linked entities

- **Diseases:** Diabetes mellitus (MONDO:0005015), prediabetes (MONDO:0006920)

## Full-text entities

- **Diseases:** DM (MESH:D003920), Prediabetes (MESH:D011236)
- **Chemicals:** triglycerides (MESH:D014280)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12890694/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12890694/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/PMC12890694/full.md

---
Source: https://tomesphere.com/paper/PMC12890694