# Predicting Metabolic Syndrome Using Supervised Machine Learning: A Multivariate Parameter Approach

**Authors:** Rodolfo Iván Valdez Vega, Jacqueline Alejandra Noboa-Velástegui, Ana Lilia Fletes-Rayas, Iñaki Álvarez, Martha Eloisa Ramos-Marquez, Sandra Luz Ruíz-Quezada, Nora Magdalena Torres-Carrillo, Rosa Elena Navarro-Hernández

PMC · DOI: 10.3390/ijms26209897 · International Journal of Molecular Sciences · 2025-10-11

## TL;DR

This study uses supervised machine learning to predict metabolic syndrome from adipokines, metabolic and cardiovascular risk factors, and anthropometric indices. The random forest and XGBoost models discriminated best, with AUCs of 0.940 and 0.954.

## Contribution

The novel contribution is the integration of adipokines, metabolic and cardiovascular risk factors, and anthropometric indices in supervised machine learning models that classify metabolic syndrome as defined by the ATP-III criteria.

## Key findings

- The Random Forest and XGBoost models achieved the highest AUCs, 0.940 and 0.954, respectively.
- Age, BRI, DAI, HOMA-IR, sdLDL-C, LDL-C, and high-molecular-weight adiponectin were key predictors.
- The Random Forest (RF) and Logistic Regression (LR) models showed the best calibration and the highest net benefit in Decision Curve Analysis (DCA).
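
The net benefit reported by Decision Curve Analysis can be illustrated with a minimal sketch. It uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The function names and toy data below are hypothetical and are not taken from the paper, whose own implementation is not shown in this summary.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging every subject whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged subjects who actually have the condition.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: flagged subjects who do not have the condition.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold weight the harm of a false positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - weight * fp / n


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: flag everyone regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy labels (1 = MetS, 0 = non-MetS) and predicted risks, invented for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.05, 0.6]

for pt in (0.1, 0.3, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = treat_all_net_benefit(y_true, pt)
    print(f"pt={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A decision curve plots these values across a range of thresholds. A model is considered useful at a given threshold when its net benefit exceeds both the treat-all line and zero (the treat-none strategy).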

## Abstract

Metabolic syndrome (MetS) is a complex condition characterized by a group of interconnected metabolic abnormalities. Due to its increasing prevalence, better predictive markers are needed. Therefore, this study aims to develop predictive models for MetS by integrating adipokines, metabolic and cardiovascular risk factors, and anthropometric indices. Data were collected from 381 subjects aged 20 to 59 years (242 women and 139 men) from Guadalajara, Jalisco, Mexico, who were classified as having MetS or non-MetS based on the ATP-III criteria. Four supervised machine learning models were developed—Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost)—and their performance was evaluated using the Area under the Curve (AUC), calibration curves, Decision Curve Analysis (DCA), and local interpretability analysis. The RF and XGBoost models achieved the highest AUCs (0.940 and 0.954). The RF and LR models were the best calibrated and showed the highest net benefit in DCA. Key variables included age, anthropometric indices (BRI and DAI), insulin resistance measures (HOMA-IR), lipid profiles (sdLDL-C and LDL-C), and high-molecular-weight adiponectin, used to classify the presence of MetS. The results highlight the usefulness of specific models and the importance of anthropometric variables, cardiovascular risk factors, metabolic profiles, and adiponectin as indicators of MetS.
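
Two of the evaluation measures named in the abstract, AUC and calibration, can be sketched in plain Python as an illustration. The AUC is computed here as the fraction of (MetS, non-MetS) pairs that the model ranks correctly, and calibration as observed versus mean predicted risk per equal-width bin. The function names, the binning scheme, and the toy data are assumptions for illustration only, not the authors' code or data.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def calibration_bins(y_true, y_prob, n_bins=5):
    """Return (mean predicted risk, observed fraction, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    summary = []
    for members in bins:
        if members:
            mean_pred = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            summary.append((mean_pred, observed, len(members)))
    return summary


# Toy labels (1 = MetS, 0 = non-MetS) and predicted risks, invented for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.05, 0.6]

print("AUC:", roc_auc(y_true, y_prob))
for mean_pred, observed, count in calibration_bins(y_true, y_prob, n_bins=2):
    print(f"mean predicted={mean_pred:.2f}  observed={observed:.2f}  n={count}")
```

A well-calibrated model yields bins where the observed fraction sits close to the mean predicted risk. A model can rank subjects well (high AUC) and still be poorly calibrated, which is why the two measures are reported separately.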

## Linked entities

- **Diseases:** Metabolic syndrome (MONDO:0000816)

## Full-text entities

- **Genes:** ADIPOQ (adiponectin, C1Q and collagen domain containing) [NCBI Gene 9370] {aka ACDC, ACRP30, ADIPQTL1, ADPN, APM-1, APM1}
- **Diseases:** MetS (MESH:D024821), insulin resistance (MESH:D007333), metabolic abnormalities (MESH:D008659)
- **Chemicals:** lipid (MESH:D008055), LDL-C (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12564442/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12564442/full.md

## References

72 references — full list in the complete paper: https://tomesphere.com/paper/PMC12564442/full.md

---
Source: https://tomesphere.com/paper/PMC12564442