# Assessment of Machine Learning Model Performance for Clinical Prediction of Insulin Resistance in the Study of Cardiovascular Risk in Adolescents—ERICA

**Authors:** Jéssica Aparecida Silva, Katia Vergetti Bloch, Moyses Szklo, Rodolfo Deusdará

PMC · DOI: 10.3390/jcm15062224 · Journal of Clinical Medicine · 2026-03-15

## TL;DR

This study compares seven machine learning models for predicting insulin resistance in Brazilian adolescents. Logistic regression achieves the best AUC (0.8 in both boys and girls), and waist circumference, triglycerides, and age emerge as the most important predictors.

## Contribution

The study evaluates and compares seven machine learning models for predicting insulin resistance in adolescents, stratified by sex, using data from 37,454 Brazilian adolescents aged 12 to 17 years.

## Key findings

- Logistic Regression had the best AUC (0.8) for predicting insulin resistance in both boys and girls.
- Calibration was better in girls than in boys for the top-performing models.
- Waist circumference, triglycerides, and age were the most important predictors for both sexes.
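The calibration finding above compares predicted probabilities with observed event rates. A minimal sketch of that check, assuming equal-width probability bins and toy data (the function name `calibration_bins`, the bin count, and the example values are illustrative assumptions, not the authors' code or data):

```python
from typing import List, Sequence, Tuple


def calibration_bins(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Group predictions into equal-width probability bins.

    Returns one (mean predicted probability, observed event rate, count)
    tuple per non-empty bin. A well-calibrated model has the first two
    values close to each other in every bin.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")

    # Per bin: [sum of predicted probabilities, number of events, count]
    totals = [[0.0, 0, 0] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        totals[idx][0] += p
        totals[idx][1] += y
        totals[idx][2] += 1

    return [(s / n, events / n, n) for s, events, n in totals if n > 0]


# Toy example (hypothetical labels and predicted probabilities).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
for mean_pred, observed, count in calibration_bins(y_true, y_prob, n_bins=2):
    print(f"predicted={mean_pred:.2f} observed={observed:.2f} n={count}")
```

Plotting the observed rate against the mean predicted probability for each bin gives a calibration curve; points near the diagonal indicate good calibration.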

## Abstract

**Background:** Insulin resistance is defined as reduced tissue responsiveness to insulin-mediated glucose actions. Gold-standard methods such as the hyperinsulinemic-euglycemic clamp and the hyperglycemic clamp are costly and rarely used in large epidemiological studies. The aim was to identify the best-performing machine learning algorithm for predicting insulin resistance in Brazilian adolescents.

**Methods:** We used data from 37,454 Brazilian adolescents aged 12 to 17 years, sampled from the Study of Cardiovascular Risk Factors in Adolescents (2013–2014). Covariates included other cardiovascular risk factors. We evaluated seven machine learning models, stratified by sex. Model performance was assessed by area under the curve (AUC), calibration curves, and decision curve analysis (DCA). Finally, we used the SHAP approach to assess the importance of each variable in the best-performing ML model.

**Results:** The Logistic Regression model had the best AUC (0.8 for both boys and girls). The best-performing ML models were better calibrated in girls than in boys. The DCA curves showed prevalence of almost equal values for girls and for boys. The most important clinical predictors for both sexes were waist circumference, triglycerides, and age.

**Conclusions:** Logistic Regression proved to be the best clinical prediction model, with performance comparable to that of more complex models. Further studies are needed in more diverse populations.
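Two of the evaluation metrics named in the abstract, AUC and the net benefit used in decision curve analysis, can be illustrated with a short sketch. It uses the standard definitions (AUC as the probability that a random positive case scores above a random negative one; net benefit as TP/n − FP/n × pt/(1 − pt) at threshold probability pt). The function names and toy data are assumptions for illustration, not the authors' implementation or results:

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking. The pairwise loop is quadratic,
    which is fine for a sketch but slow for large samples.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> float:
    """Net benefit of treating everyone with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy for a decision curve: classify everyone as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy example (hypothetical labels and predicted probabilities).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
print("AUC:", auc(y_true, y_prob))
print("Net benefit at 0.5:", net_benefit(y_true, y_prob, 0.5))
print("Treat-all at 0.5:", net_benefit_treat_all(y_true, 0.5))
```

A decision curve plots `net_benefit` over a range of thresholds alongside the treat-all and treat-none (net benefit 0) reference strategies; a model is useful at thresholds where its curve lies above both.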

## Linked entities

- **Species:** Homo sapiens (taxon 9606)

## Full-text entities

- **Genes:** INS (insulin) [NCBI Gene 3630] {aka IDDM, IDDM1, IDDM2, ILPR, IRDN, MODY10}
- **Diseases:** hyperinsulinemic-euglycemic (MESH:D044903), hyperglycemic (MESH:D006944), Insulin Resistance (MESH:D007333)
- **Chemicals:** glucose (MESH:D005947), triglycerides (MESH:D014280)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13026674/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13026674/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/PMC13026674/full.md

---
Source: https://tomesphere.com/paper/PMC13026674