# Non-Glycemic Clinical Data for Type 2 Diabetes Detection in Mexican Adults: A Comparative Analysis of Atherogenic Indices, Statistical Transformations, and Machine Learning Algorithms

**Authors:** Martin Hazael Guerrero-Flores, Valeria Maeda-Gutiérrez, Carlos E. Galván-Tejada, Jorge I. Galván-Tejada, Miguel Cruz, Luis Alberto Flores-Chaires, Karina Trejo-Vázquez, Rafael Magallanes-Quintanar, Javier Saldívar

PMC · DOI: 10.3390/diagnostics16010053 · Diagnostics · 2025-12-23

## TL;DR

This study tests whether non-glycemic data (lipid profiles and anthropometric measurements) can detect type 2 diabetes in Mexican adults, comparing atherogenic indices, statistical transformations, and four machine learning models.

## Contribution

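The study proposes a framework for T2D detection based exclusively on non-glycemic data. It evaluates how atherogenic indices, statistical transformations (logarithmic, square-root, and Z-standardization), and four machine learning algorithms (logistic regression, SVM-RBF, random forest, and XGBoost) affect detection performance.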

## Key findings

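- The atherogenic index of plasma (AIP) showed the highest discriminatory power among the atherogenic indices tested (Castelli I, Castelli II, TG/HDL, and AIP).
- In the final experiment, the SVM-RBF and XGBoost models achieved AUC values greater than 0.90 using transformed non-glycemic data.
- Normality-based statistical transformations improved the performance of distribution-sensitive algorithms such as SVM.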
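The feature engineering behind these findings can be sketched in a few lines. The block below is a minimal illustration, not the authors' code: it uses the standard definitions of the four indices (Castelli I = TC/HDL-C, Castelli II = LDL-C/HDL-C, TG/HDL-C, and AIP = log10(TG/HDL-C) on molar concentrations) and the three transformations named in the abstract. The function names, the assumption that inputs arrive in mg/dL, and the mg/dL-to-mmol/L conversion factors are all assumptions of this sketch; the summary does not state the units used in the study.

```python
import math
import statistics

# Assumed conversion factors (mg/dL per mmol/L); not taken from the paper.
TG_MGDL_PER_MMOLL = 88.57
CHOL_MGDL_PER_MMOLL = 38.67


def atherogenic_indices(tc, hdl, ldl, tg):
    """Compute four atherogenic indices for one record.

    Inputs are assumed to be strictly positive values in mg/dL.
    """
    # AIP is conventionally defined on molar concentrations, so convert first.
    tg_mmol = tg / TG_MGDL_PER_MMOLL
    hdl_mmol = hdl / CHOL_MGDL_PER_MMOLL
    return {
        "castelli_1": tc / hdl,   # total cholesterol / HDL-C
        "castelli_2": ldl / hdl,  # LDL-C / HDL-C
        "tg_hdl": tg / hdl,       # triglycerides / HDL-C
        "aip": math.log10(tg_mmol / hdl_mmol),
    }


def log_transform(values):
    """Natural-log transform; requires strictly positive values."""
    return [math.log(v) for v in values]


def sqrt_transform(values):
    """Square-root transform; requires non-negative values."""
    return [math.sqrt(v) for v in values]


def z_standardize(values):
    """Center to mean 0 and scale by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); a design choice here
    return [(v - mean) / sd for v in values]


# Toy record for illustration only; these values are not from the study.
toy_indices = atherogenic_indices(tc=200.0, hdl=50.0, ldl=120.0, tg=150.0)
print(toy_indices)
```

Note that `z_standardize` divides by the sample standard deviation, whereas some libraries scale by the population standard deviation; the two differ only slightly on a dataset of more than a thousand records. In a real pipeline the mean and standard deviation would be estimated on the training split only and then applied to the test split.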

## Abstract

Background: Type 2 diabetes (T2D) is a growing public health problem in Mexico. Lipid profile alterations have been shown to appear years before changes in glycemic biomarkers, and some glycemic biomarkers have limited availability, especially in underserved settings. Anthropometric variables and lipids are therefore relevant indicators for early detection of the disease. This study evaluates the capacity of non-glycemic clinical data, including lipid profile and anthropometric indicators, to detect T2D using machine learning, and compares the performance of different feature engineering approaches.

Methods: Using more than a thousand clinical records of Mexican adults, three experiments were developed: (1) a distribution and normality analysis to characterize the variability of lipid variables; (2) an evaluation of the predictive power of multiple atherogenic indices (Castelli I, Castelli II, TG/HDL, and AIP); and (3) the implementation of statistical transformations (logarithmic, square-root, and Z-standardization) to stabilize variance and improve feature quality. Logistic regression, SVM-RBF, random forest, and XGBoost models were trained on each feature set and evaluated using accuracy, sensitivity, specificity, F1-score, and area under the ROC curve (AUC).

Results: The AIP index showed the greatest discriminatory power among the atherogenic indices, while normality-based transformations improved the performance of distribution-sensitive models, such as SVM. In the final experiment, the SVM-RBF and XGBoost models achieved AUC values greater than 0.90, demonstrating the feasibility of a diagnostic approach based exclusively on non-glycemic data.

Conclusions: The findings indicate that the transformed lipid profile and anthropometric variables can constitute a solid and accessible alternative for the early detection of T2D in clinical and public health contexts, offering a robust methodological framework for future predictive applications in the absence of traditional glycemic biomarkers.
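The five evaluation metrics listed in the abstract can be computed from a model's predictions as sketched below. This is a standard-library illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, labels are assumed to be coded 1 for T2D and 0 for controls, a ratio with a zero denominator is reported as 0.0 by convention, and the AUC is computed with the pairwise (Mann-Whitney) formulation, which equals the area under the empirical ROC curve.

```python
def _ratio(numerator, denominator):
    """Safe division; returns 0.0 for an empty denominator (a convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 from binary labels (1 = T2D)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = _ratio(tp, tp + fn)  # recall on the positive class
    specificity = _ratio(tn, tn + fp)  # recall on the negative class
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and scores for illustration only; not data from the study.
toy_true = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 cutoff
print(classification_metrics(toy_true, toy_pred))
print(roc_auc(toy_true, toy_scores))
```

The pairwise AUC is quadratic in the number of records, which is acceptable for a dataset of roughly a thousand rows; a rank-based implementation would scale better. Unlike the other four metrics, AUC does not depend on the decision threshold, which is why it is useful for comparing models whose scores are calibrated differently.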

## Linked entities

- **Diseases:** Type 2 diabetes (MONDO:0005148)

## Full-text entities

- **Genes:** AIP (AHR interacting HSP90 co-chaperone) [NCBI Gene 9049] {aka ARA9, FKBP16, FKBP37, PITA1, SMTPHN, XAP-2}
- **Diseases:** Atherogenic (MESH:D050197), T2D (MESH:D003924)
- **Chemicals:** Lipid (MESH:D008055), TG (MESH:D013866)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12786317/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12786317/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12786317/full.md

---
Source: https://tomesphere.com/paper/PMC12786317