# Assessment of Ten Insulin Resistance Surrogate Indexes Predicts New-Onset Cardiovascular Disease Incidence in Patients with Prediabetes or Diabetes: Insights from CHARLS Data with Machine Learning Analysis

**Authors:** Hang Xie, Chaoying Yan, Yi Zheng, Haoyu Wu

PMC · DOI: 10.5334/gh.1532 · 2026-03-12

## TL;DR

This study finds that two insulin resistance indexes, eGDR and CVAI, best predict new cardiovascular disease in Chinese patients with prediabetes or diabetes when used in machine learning models.

## Contribution

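Among ten IR surrogate indexes, the study identifies eGDR and CVAI as the strongest predictors of new-onset CVD in Chinese patients with prediabetes or diabetes, and shows that machine learning models incorporating them discriminate well.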

## Key findings

- The highest eGDR quartile had a 47.3% lower CVD risk than the lowest quartile (OR = 0.527, 95% CI: 0.353–0.789).
- The highest CVAI quartile had a 33.1% higher CVD risk than the lowest quartile (OR = 1.331, 95% CI: 1.038–1.709).
- K-Nearest Neighbors (KNN) models incorporating eGDR and CVAI achieved an AUC of 0.936 (95% CI: 0.928–0.943) for CVD prediction.

## Abstract

Insulin resistance (IR) is a key driver of prediabetes, type 2 diabetes, and cardiovascular disease (CVD) risk. This study evaluated the predictive performance of ten IR surrogate indexes (TyG, TyG-BMI, TyG-WC, TyG-WHtR, METS-IR, AIP, TyHGB, CTI, eGDR, CVAI) for new-onset CVD in Chinese patients with prediabetes or diabetes, aiming to identify the most effective index for cardiovascular risk stratification.
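To make the index names concrete, here is a minimal Python sketch of how a few of them are commonly computed. The formulas are the definitions widely used in the broader literature, not quoted from this abstract, so the paper's exact implementation (units, rounding, and the more involved indexes such as METS-IR, eGDR, and CVAI) may differ. All function names are hypothetical.

```python
from math import log, log10

def tyg(tg_mg_dl, fpg_mg_dl):
    """Triglyceride-glucose index: ln(TG [mg/dL] * FPG [mg/dL] / 2)."""
    return log(tg_mg_dl * fpg_mg_dl / 2.0)

def tyg_bmi(tg_mg_dl, fpg_mg_dl, bmi):
    """TyG multiplied by body mass index (kg/m^2)."""
    return tyg(tg_mg_dl, fpg_mg_dl) * bmi

def tyg_wc(tg_mg_dl, fpg_mg_dl, waist_cm):
    """TyG multiplied by waist circumference (cm)."""
    return tyg(tg_mg_dl, fpg_mg_dl) * waist_cm

def tyg_whtr(tg_mg_dl, fpg_mg_dl, waist_cm, height_cm):
    """TyG multiplied by the waist-to-height ratio."""
    return tyg(tg_mg_dl, fpg_mg_dl) * (waist_cm / height_cm)

def aip(tg_mmol_l, hdl_mmol_l):
    """Atherogenic index of plasma: log10(TG / HDL-C), both in mmol/L."""
    return log10(tg_mmol_l / hdl_mmol_l)

# Illustrative values only, not CHARLS data.
example = {
    "TyG": tyg(150, 100),
    "TyG-BMI": tyg_bmi(150, 100, 25.0),
    "TyG-WHtR": tyg_whtr(150, 100, 85.0, 170.0),
    "AIP": aip(1.7, 1.0),
}
```

Note that AIP is unit-sensitive: triglycerides and HDL-C have different mg/dL-to-mmol/L conversion factors, so the ratio must be formed from molar concentrations.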

This longitudinal cohort study analyzed 3,532 middle-aged and elderly participants from the China Health and Retirement Longitudinal Study (CHARLS) baseline (Wave 1), with incident CVD events assessed at follow-up (Wave 4). Ten IR surrogate indexes were calculated at baseline. Multivariate logistic regression, adjusted for confounders, assessed associations between these indexes and CVD. Non-linear relationships were explored using restricted cubic spline analyses. Nine machine learning algorithms were employed to develop predictive models, with performance evaluated via receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis.
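The modelling step can be illustrated with a dependency-free sketch of a K-Nearest Neighbors classifier scored by ROC AUC. This is a toy illustration under stated assumptions, not the authors' pipeline: the two features stand in for standardized eGDR and CVAI, the data points are invented, and `knn_proba` and `roc_auc` are hypothetical helper names.

```python
from math import dist

def knn_proba(train_X, train_y, x, k=3):
    """Predicted event probability: share of the k nearest (Euclidean)
    training points whose label is 1."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))
    return sum(label for _, label in ranked[:k]) / k

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented (eGDR z-score, CVAI z-score) pairs; 1 = incident CVD.
train_X = [(-1.0, 1.2), (-0.8, 0.9), (-0.5, 1.0),
           (0.9, -1.0), (1.1, -0.7), (0.7, -1.2)]
train_y = [1, 1, 1, 0, 0, 0]

# Score a separate held-out set rather than the training points.
test_X = [(-0.9, 1.0), (1.0, -0.9), (0.0, 0.1), (-0.2, 0.4)]
test_y = [1, 0, 0, 1]
test_scores = [knn_proba(train_X, train_y, x, k=3) for x in test_X]
test_auc = roc_auc(test_y, test_scores)
```

Scoring on held-out data matters especially for KNN, because each training point counts among its own nearest neighbors and would otherwise inflate the apparent AUC.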

During follow-up, 874 participants (24.7%) developed CVD. Each standard deviation increase in eGDR was associated with reduced CVD risk (OR = 0.822, 95% CI: 0.696–0.969), while CVAI was linked to increased risk (OR = 1.124, 95% CI: 1.028–1.229). Compared to the lowest quartile, the highest eGDR quartile had a 47.3% lower CVD risk (OR = 0.527, 95% CI: 0.353–0.789, P = 0.0018), and the highest CVAI quartile had a 33.1% higher risk (OR = 1.331, 95% CI: 1.038–1.709, P = 0.0243). Incorporating eGDR and CVAI into machine learning models, particularly K-Nearest Neighbors (KNN), enhanced discrimination (AUC = 0.936, 95% CI: 0.928–0.943).
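The percentages above follow directly from the odds ratios, and the reported P values can be roughly cross-checked from the confidence intervals. The sketch below assumes Wald-type intervals that are symmetric on the log-odds scale (an assumption, since the abstract does not state how the intervals were built); the function names are hypothetical. Strictly, an odds ratio describes a change in odds, which the abstract reports as a change in risk.

```python
from math import erfc, log, sqrt

def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio into a signed percent change in odds."""
    return (odds_ratio - 1.0) * 100.0

def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided P value recovered from a 95% CI.

    Assumes the CI is exp(log(OR) +/- z_crit * SE), so the standard
    error is the CI width on the log scale divided by 2 * z_crit.
    """
    se = (log(ci_high) - log(ci_low)) / (2.0 * z_crit)
    z = log(odds_ratio) / se
    return erfc(abs(z) / sqrt(2.0))  # two-sided normal tail area

# Cumulative incidence reported in the abstract: 874 of 3,532.
incidence_pct = 874 / 3532 * 100

egdr_q4_change = percent_change_in_odds(0.527)   # highest vs lowest quartile
cvai_q4_change = percent_change_in_odds(1.331)
egdr_per_sd_change = percent_change_in_odds(0.822)

egdr_q4_p = wald_p_from_ci(0.527, 0.353, 0.789)
cvai_q4_p = wald_p_from_ci(1.331, 1.038, 1.709)
```

Under these assumptions the recovered P values land close to the reported 0.0018 and 0.0243; small differences are expected because the published odds ratios and interval bounds are rounded to three decimals.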

eGDR and CVAI outperformed other IR indexes in predicting CVD in Chinese patients with prediabetes or diabetes. Their integration into KNN models significantly improved risk stratification, suggesting their utility as accessible clinical tools for early identification and intervention to reduce CVD burden.

## Linked entities

- **Diseases:** cardiovascular disease (MONDO:0004995), prediabetes (MONDO:0006920), type 2 diabetes (MONDO:0005148)

## Full-text entities

- **Genes:** AIP (AHR interacting HSP90 co-chaperone) [NCBI Gene 9049] {aka ARA9, FKBP16, FKBP37, PITA1, SMTPHN, XAP-2}
- **Diseases:** type 2 diabetes (MESH:D003924), IR (MESH:D007333), Prediabetes (MESH:D011236), Diabetes (MESH:D003920), CVD (MESH:D002318)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12985947/full.md

---
Source: https://tomesphere.com/paper/PMC12985947