# Machine learning-based estimation of trunk fat percentage and its association with cardiometabolic risk leveraging two large national cohorts

**Authors:** Liangming Zeng, Xuemin Guo, Hesen Wu, Changjing Huang

PMC · DOI: 10.3389/fnut.2026.1715570 · 2026-01-22

## TL;DR

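This study developed an XGBoost model that estimates DXA-measured trunk fat percentage from five basic anthropometric variables (sex, waist circumference, height, weight, and age) and found that the estimate discriminates cardiometabolic diseases better than whole-body fat percentage, with an average relative AUC improvement of 2.77%.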

## Contribution

A simplified five-variable machine learning model for estimating trunk fat percentage (test-set R² = 0.8450) whose estimates outperformed whole-body fat percentage in discriminating cardiometabolic diseases in an external cohort.

## Key findings

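- The XGBoost model achieved an R² of 0.8509 on the NHANES test set when estimating DXA-measured trunk fat percentage.
- A simplified model with five variables (sex, waist circumference, height, weight, and age) retained 99.3% of the full model's accuracy (R² = 0.8450).
- In the CHARLS external validation, estimated trunk fat percentage outperformed whole-body fat percentage for all five cardiometabolic conditions, with the largest gain for diabetes (AUC = 0.6607 vs. 0.6401).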

## Abstract

This study aimed to develop and validate a machine learning model for accurate estimation of trunk fat percentage using readily available anthropometric measures, and to evaluate its discriminative performance for cardiometabolic diseases compared with conventional whole-body fat percentage.

We used data from the National Health and Nutrition Examination Survey (NHANES; 1999–2006 and 2011–2018) as the development cohort (n = 30,443). Trunk fat percentage, measured by dual-energy X-ray absorptiometry (DXA), served as the gold standard. Six regression algorithms were evaluated, with model performance assessed by the coefficient of determination (R²). External validation was performed in the China Health and Retirement Longitudinal Study (CHARLS) cohort (n = 13,524), where discriminative power for hypertension, dyslipidemia, diabetes, heart disease, and stroke was evaluated using the area under the receiver operating characteristic curve (AUC).
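The two evaluation metrics named above can be sketched in plain Python. This is a minimal illustration of the standard definitions, not the authors' pipeline: the function names `r_squared` and `roc_auc` and all toy values are hypothetical and were made up for the example, not taken from NHANES or CHARLS.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def roc_auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs in which the case
    has the higher score; tied pairs count as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative values only (not study data): reference vs. estimated fat %.
toy_true = [30.0, 35.0, 40.0, 45.0]
toy_pred = [32.0, 34.0, 41.0, 44.0]
toy_r2 = r_squared(toy_true, toy_pred)

# Illustrative values only: disease status (1 = case) and estimated fat %.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [28.0, 33.0, 31.0, 40.0, 36.0, 38.0]
toy_auc = roc_auc(toy_labels, toy_scores)

print(f"R^2 = {toy_r2:.3f}, AUC = {toy_auc:.3f}")
```

The pairwise AUC is quadratic in sample size, which is fine for a sketch; for cohorts of the size described here, a rank-based implementation from an established statistics library would be the practical choice.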

The XGBoost model performed best in the development cohort, achieving an R² of 0.8509 on the test set. A simplified model using only five variables (sex, waist circumference, height, weight, and age) retained 99.3% of the full model’s accuracy (R² = 0.8450). In external validation, the machine learning-estimated trunk fat percentage consistently outperformed whole-body fat percentage across all cardiometabolic conditions, with the largest AUC improvement observed for diabetes (trunk fat AUC = 0.6607 vs. whole-body fat AUC = 0.6401; relative improvement of 3.22%). The average relative improvement in AUC across all endpoints was 2.77%.
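The percentages in this paragraph can be reproduced from the reported values. The short check below assumes that "retained accuracy" means the ratio of the simplified model's R² to the full model's R², and that "relative improvement" means the AUC difference divided by the whole-body fat AUC; both readings are assumptions that happen to match the reported figures. The helper name `relative_improvement` is hypothetical.

```python
def relative_improvement(new, baseline):
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0


# Values reported in the abstract.
r2_full = 0.8509
r2_simplified = 0.8450
auc_trunk_diabetes = 0.6607
auc_whole_diabetes = 0.6401

# Assumed definition: share of the full model's R^2 kept by the simplified model.
retained_pct = r2_simplified / r2_full * 100.0

# Assumed definition: AUC gain relative to the whole-body fat baseline.
diabetes_gain_pct = relative_improvement(auc_trunk_diabetes, auc_whole_diabetes)

print(f"Retained accuracy: {retained_pct:.1f}%")             # 99.3%
print(f"Diabetes AUC improvement: {diabetes_gain_pct:.2f}%")  # 3.22%
```

Note that the absolute AUC difference for diabetes is 0.0206; the 3.22% figure is relative to the baseline AUC, not a change in percentage points.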

This study presents a highly accurate and clinically practical machine learning model for trunk fat percentage estimation using five basic anthropometric measurements. External validation confirms that trunk fat percentage is a superior biomarker for identifying cardiometabolic risks compared to whole-body fat percentage. The model provides a reliable tool for non-invasive central adiposity assessment in large-scale epidemiological studies and clinical practice.

## Linked entities

- **Diseases:** dyslipidemia (MONDO:0002525), diabetes (MONDO:0005015), heart disease (MONDO:0005267), stroke (MONDO:0005098)

## Full-text entities

- **Diseases:** hypertension (MESH:D006973), stroke (MESH:D020521), diabetes (MESH:D003920), cardiometabolic diseases (MESH:D024821), dyslipidemia (MESH:D050171), heart disease (MESH:D006331), adiposity (MESH:D018205)

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12872505/full.md

---
Source: https://tomesphere.com/paper/PMC12872505