# Development of a machine learning model for hepatic steatosis screening using non-invasive Traditional Chinese Medicine diagnostics and clinical variables: a health checkup study with community screening potential

**Authors:** Ke Zhu, Lihua Li, Zhihui Zhao, Sheng Zheng, Bing Lin, Wenjun Tang, Weihong Li

PMC · DOI: 10.3389/fmed.2025.1704441 · Frontiers in Medicine · 2026-01-14

## TL;DR

A machine learning model combining non-invasive Traditional Chinese Medicine diagnostics with clinical variables was developed to screen for ultrasound-diagnosed hepatic steatosis, reaching AUCs of 0.83 to 0.84 in nested cross-validation.

## Contribution

The study combines non-invasive TCM diagnostics with demographic and anthropometric variables to predict hepatic steatosis in a community screening context.

## Key findings

- XGBoost and logistic regression models achieved AUCs of 0.84 and 0.83, respectively, for hepatic steatosis prediction.
- TCM features like HSV_H of nose and T5, along with BMI and weight, were key predictors in the model.
- Both models performed robustly in nested cross-validation, indicating potential for community-based screening of liver disease, although external validation is still needed.

## Abstract

Steatotic liver disease (SLD), underpinned by hepatic steatosis, is a global health concern affecting approximately 30% of the population. Current screening methods primarily rely on laboratory tests and lack broad-spectrum applicability. This study aims to develop a predictive model by selecting from non-invasive Traditional Chinese Medicine (TCM) diagnostics, demographic, and anthropometric variables to enhance early detection of hepatic steatosis.

Data from 1,703 local residents undergoing health checkups at the health management center of the Affiliated Hospital of Chengdu University of Traditional Chinese Medicine between December 2018 and December 2021 were analyzed. Demographic, anthropometric, and TCM diagnostic data were collected using questionnaires and standardized instruments. Hepatic steatosis was diagnosed via ultrasonography. Predictive models were developed using three parametric and six non-parametric algorithms and evaluated through nested five-fold stratified cross-validation. Performance was assessed in terms of discrimination, classification metrics at the optimal threshold, calibration, and clinical utility.
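The nested five-fold stratified cross-validation described above can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: the function names `stratified_folds` and `nested_cv_splits`, the use of five inner folds, and the seeding scheme are all assumptions made for illustration. The point it shows is that the inner splits used for tuning are drawn only from the outer training data, so each outer test fold stays unseen until final evaluation.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split indices 0..n-1 into k folds with roughly equal class shares."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # Deal each class round-robin so every fold gets its share.
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds


def nested_cv_splits(labels, k_outer=5, k_inner=5, seed=0):
    """Yield (outer_train, outer_test, inner_splits) index lists.

    inner_splits is a list of (inner_train, inner_val) pairs built only
    from outer_train (hypothetical helper, not from the paper).
    """
    outer = stratified_folds(labels, k_outer, seed)
    for i, test_idx in enumerate(outer):
        train_idx = [j for f, fold in enumerate(outer) if f != i for j in fold]
        train_labels = [labels[j] for j in train_idx]
        inner = stratified_folds(train_labels, k_inner, seed + 1 + i)
        inner_splits = []
        for v, val_pos in enumerate(inner):
            # Map positions within train_idx back to original indices.
            val = [train_idx[p] for p in val_pos]
            tr = [train_idx[p]
                  for f, fold in enumerate(inner) if f != v for p in fold]
            inner_splits.append((tr, val))
        yield train_idx, test_idx, inner_splits
```

In use, hyperparameters would be chosen on the inner splits, the model refit on `outer_train`, and metrics computed once on `outer_test`; averaging over the outer folds gives the reported estimate.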

Eleven variables were selected as predictors: three anthropometric variables (body mass index [BMI], weight, and diastolic blood pressure) and eight TCM diagnostic indicators (HSV_H of nose, T5, phlegm-dampness constitution score, RGB_R of mid tongue, Lab_A of lip, T4, H5, and Lab_A of orbit). Logistic regression (AUC 0.83, 95% CI: 0.809–0.850) and XGBoost (AUC 0.84, 95% CI: 0.818–0.859) achieved the highest AUC among parametric and non-parametric models, respectively. XGBoost showed marginally better performance than logistic regression in AUC and clinical utility. Differences between the two models in classification metrics, calibration slopes, and calibration intercepts were not statistically significant. SHAP analysis identified BMI and body weight as the most influential predictors, alongside substantial contributions from TCM features (HSV_H of nose and T5).
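For readers unfamiliar with how an AUC and its 95% confidence interval can be obtained, the following is a minimal standard-library sketch. The summary does not state how the authors computed their intervals, so the percentile bootstrap used here is an assumption, and the function names `auc` and `bootstrap_auc_ci` are hypothetical. Labels are assumed to be coded 0 (no steatosis) and 1 (steatosis).

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    auc(y_true, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        # Skip resamples that contain a single class.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

The pairwise loop is quadratic in sample size, which is fine for illustration; a rank-based computation would be preferable for cohorts of the size reported here.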

TCM features combined with anthropometric variables can be used to develop a non-invasive screening model for ultrasound-diagnosed hepatic steatosis. Both the XGBoost and logistic regression models demonstrated robust performance, though external validation is needed to confirm generalizability. This non-invasive approach offers a practical tool with potential for hepatic steatosis screening in community settings.

## Full-text entities

- **Diseases:** Hepatic steatosis (MESH:D005234), SLD (MESH:D008107)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12847276/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12847276/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/PMC12847276/full.md

---
Source: https://tomesphere.com/paper/PMC12847276