# Extreme gradient boosting using conventional parameters accurately predicts insulin sensitivity in young and middle-aged Japanese persons

**Authors:** Norimitsu Murai, Naoko Saito, Sayuri Nii, Hiroto Nishikawa, Eriko Kodama, Tatsuya Iida, Hideyuki Imai, Mai Hashizume, Rie Tadokoro, Chiho Sugisawa, Toru Iizaka, Fumiko Otsuka, Shun Ishibashi, Shoichiro Nagasaka

PMC · DOI: 10.3389/fendo.2025.1661376 · 2025-10-17

## TL;DR

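The study shows that machine learning, in particular extreme gradient boosting, can estimate insulin sensitivity in young and middle-aged Japanese individuals from physical indicators alone, or from physical indicators combined with lipid and fasting glucose levels.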

## Contribution

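Among eight ML methods, extreme gradient boosting produced the estimates that correlated best with insulin sensitivity indices, and ML-derived estimates correlated more strongly with those indices than 11 conventional lipid-related estimates did.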

## Key findings

- Extreme gradient boosting gave the best correlations with insulin sensitivity indices among the eight ML methods tested.
- The contribution of clinical factors to insulin sensitivity varies by age and glucose tolerance status.
- Conventional lipid-related estimates showed weaker correlations with insulin sensitivity than ML-derived estimates.

## Abstract

This study tested the hypothesis that insulin sensitivity (SI) can be estimated using machine learning (ML) based only on physical indicators or with the addition of lipid and fasting glucose levels.

In 1,268 young (age <40 years, normal glucose tolerance; NGT) and 1,723 middle-aged Japanese persons with NGT (n=1,276) and glucose intolerance (n=447), the Matsuda index and the reciprocal of the homeostasis model assessment of insulin resistance (1/HOMA-IR) were calculated as indices of SI. In each group, SI was estimated using eight ML methods, first from physical indicators alone and then from physical indicators together with lipid and fasting glucose levels. In addition, 11 lipid-related estimates of SI were calculated.
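For reference, both SI indices have widely used closed forms. The sketch below assumes the conventional definitions (glucose in mg/dL, insulin in µU/mL, HOMA-IR constant 405, Matsuda constant 10,000), which the abstract does not spell out. The function names, the choice of sampling times, and any input values are hypothetical.

```python
import math
from statistics import mean


def inverse_homa_ir(fasting_glucose_mg_dl, fasting_insulin_uu_ml):
    """Return 1/HOMA-IR, assuming HOMA-IR = glucose (mg/dL) * insulin (uU/mL) / 405."""
    homa_ir = fasting_glucose_mg_dl * fasting_insulin_uu_ml / 405.0
    return 1.0 / homa_ir


def matsuda_index(glucose_mg_dl, insulin_uu_ml):
    """Return the Matsuda index from an oral glucose tolerance test series.

    Both sequences hold values at the same sampling times, with the
    fasting (0 min) value first. Assumed form:
    10000 / sqrt(G0 * I0 * mean(G) * mean(I)).
    """
    if not glucose_mg_dl or len(glucose_mg_dl) != len(insulin_uu_ml):
        raise ValueError("glucose and insulin series must be non-empty and equal in length")
    g0, i0 = glucose_mg_dl[0], insulin_uu_ml[0]
    return 10000.0 / math.sqrt(g0 * i0 * mean(glucose_mg_dl) * mean(insulin_uu_ml))
```

Higher values of either index indicate greater insulin sensitivity. 1/HOMA-IR needs only fasting samples, whereas the Matsuda index also uses the post-load values.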

Estimates by extreme gradient boosting showed the best correlations with SI indices among eight ML methods. According to feature importance and SHapley Additive exPlanations values, the contribution of each clinical factor to SI differed greatly by age and glucose tolerance status. Relationships of lipid-related estimates with SI were weaker than those of ML-derived estimates.
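Ranking estimators by how well their estimates correlate with a reference SI index can be sketched as follows. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption here. The method names and numbers are made up for illustration and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_methods(reference, estimates_by_method):
    """Return (method, r) pairs sorted from strongest to weakest correlation."""
    scores = {
        name: pearson_r(reference, estimates)
        for name, estimates in estimates_by_method.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical reference index and estimates from two hypothetical methods.
reference_si = [2.0, 4.0, 6.0, 8.0, 10.0]
toy_estimates = {
    "method_a": [2.1, 3.9, 6.2, 7.8, 10.1],
    "method_b": [5.0, 3.0, 7.0, 6.0, 8.0],
}
ranking = rank_methods(reference_si, toy_estimates)
```

In this toy input, `method_a` tracks the reference more closely than `method_b`, so it appears first in `ranking`.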

SI could be estimated using ML based on physical indicators alone, or on physical indicators together with lipid and fasting glucose levels. The results also imply that it would be difficult to establish universal and robust estimates of SI using conventional parameters. Further validation studies are needed in diverse ethnic groups with varied body compositions.

## Linked entities

- **Diseases:** glucose intolerance (MONDO:0001076)

## Full-text entities

- **Genes:** INS (insulin) [NCBI Gene 3630] {aka IDDM, IDDM1, IDDM2, ILPR, IRDN, MODY10}
- **Diseases:** insulin resistance (MESH:D007333), glucose intolerance (MESH:D018149)
- **Chemicals:** lipid (MESH:D008055), glucose (MESH:D005947)

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12575152/full.md

---
Source: https://tomesphere.com/paper/PMC12575152