# A Simplified Machine Learning Model for Predicting Reduced Kidney Function in Thai Patients with Type 2 Diabetes: A Retrospective Study

**Authors:** Wanjak Pongsittisak, Swangjit Suraamornkul

PMC · DOI: 10.3390/jcm14134735 · 2025-07-04

## TL;DR

This study developed simple tree-based machine learning models that use routine clinical data to identify Thai patients with type 2 diabetes who are at high risk of reduced kidney function (eGFR < 60 mL/min/1.73 m² persisting for more than 90 days).

## Contribution

The study shows that interpretable ML models built from a minimal set of routine clinical features, with or without HbA1c, can stratify the risk of reduced kidney function and support early CKD detection in resource-limited settings.

## Key findings

- XGBoost achieved strong predictive performance with an AUROC of 0.824 for the with-HbA1c model.
- Age, HbA1c, and systolic blood pressure were identified as the most influential predictors of reduced kidney function.
- The non-HbA1c model performed comparably, with an AUROC of 0.819, so the approach remains usable when HbA1c data are unavailable.
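
For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The sketch below illustrates that definition in plain Python. It is not the authors' evaluation code; the function name `auroc` and the toy labels and scores are hypothetical and unrelated to the study data.

```python
def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie gets half credit
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = reduced kidney function, 0 = not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auroc = auroc(labels, scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

On this reading, an AUROC of 0.824 means the model ranks a patient with the outcome above a patient without it in roughly 82% of such pairs.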

## Abstract

Background: Chronic kidney disease (CKD) is a prevalent complication among individuals with type 2 diabetes (T2D), posing significant diagnostic challenges in resource-limited settings due to infrequent testing and missed hospital visits. This study aimed to develop a simple, effective machine learning (ML) model to identify T2D patients at high risk for reduced kidney function.

Methods: We retrospectively analyzed data from 3471 T2D patients collected over a ten-year period at a university hospital in Bangkok, Thailand. Two models were developed using readily available clinical features: one including hemoglobin A1c (HbA1c) levels (the “with-HbA1c” model) and one excluding HbA1c levels (the “non-HbA1c” model). Three tree-based ML algorithms were employed: decision tree, random forest, and extreme gradient boosting (XGBoost). The outcome label was CKD, defined as an estimated glomerular filtration rate (eGFR) < 60 mL/min/1.73 m² that persisted for more than 90 days. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). Feature importance was assessed using Shapley additive explanations (SHAP).

Results: The XGBoost algorithm demonstrated strong predictive performance. The “with-HbA1c” model achieved an AUROC of 0.824, while the “non-HbA1c” model attained a comparable AUROC of 0.819. Both models were well-calibrated. SHAP analysis identified age, HbA1c, and systolic blood pressure as the most influential predictors.

Conclusions: Our simplified, interpretable ML models can effectively stratify the risk of reduced kidney function in patients with T2D using minimal, routine data. These models represent a promising step toward integration into clinical practice, such as through EHR-based alerts or patient-facing mobile applications, to improve early CKD detection, particularly in resource-limited settings.
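
The outcome definition in the abstract (eGFR below 60 mL/min/1.73 m² persisting for more than 90 days) can be illustrated with a minimal labeling sketch. The abstract does not say how persistence was operationalized, so the rule used here is an assumption: an unbroken run of sub-threshold readings spanning more than 90 days, with any reading at or above the threshold resetting the run. The function name `has_persistent_low_egfr` and the example patients are hypothetical.

```python
from datetime import date

EGFR_THRESHOLD = 60.0    # mL/min/1.73 m^2, as stated in the abstract
MIN_DURATION_DAYS = 90   # low eGFR must persist for MORE than this many days


def has_persistent_low_egfr(measurements):
    """Return True if consecutive sub-threshold eGFR readings span more than 90 days.

    measurements: iterable of (date, egfr) tuples in any order.
    Assumption: a reading at or above the threshold resets the run.
    """
    run_start = None
    for day, egfr in sorted(measurements):  # chronological order
        if egfr < EGFR_THRESHOLD:
            if run_start is None:
                run_start = day  # first low reading of a new run
            if (day - run_start).days > MIN_DURATION_DAYS:
                return True
        else:
            run_start = None  # recovery breaks the run
    return False


# Hypothetical patients, not study data.
sustained = [(date(2020, 1, 1), 55.0), (date(2020, 3, 1), 58.0), (date(2020, 5, 1), 52.0)]
transient = [(date(2020, 1, 1), 55.0), (date(2020, 3, 1), 72.0), (date(2020, 5, 1), 52.0)]
exactly_90 = [(date(2021, 1, 1), 50.0), (date(2021, 4, 1), 50.0)]  # 90 days apart

label_sustained = has_persistent_low_egfr(sustained)    # low readings span 121 days
label_transient = has_persistent_low_egfr(transient)    # run broken by a reading of 72
label_exactly_90 = has_persistent_low_egfr(exactly_90)  # 90 days is not "more than 90"
```

Requiring persistence rather than a single low reading helps separate chronic impairment from transient dips in eGFR, which is why the time window is part of the label.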

## Linked entities

- **Diseases:** chronic kidney disease (MONDO:0005300), type 2 diabetes (MONDO:0005148)

## Full-text entities

- **Diseases:** T2D (MESH:D003924), CKD (MESH:D051436), Reduced Kidney Function (MESH:D007680)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

Six figures with captions are available in the complete paper: https://tomesphere.com/paper/PMC12250820/full.md

---
Source: https://tomesphere.com/paper/PMC12250820