# Development and validation of a high-performance clinical predictive model for early identification of non-alcoholic fatty liver disease

**Authors:** Tong Liang, Junli Ren

PMC · DOI: 10.3389/fphys.2026.1689882 · Frontiers in Physiology · 2026-02-12

## TL;DR

This study developed a high-performance model to predict non-alcoholic fatty liver disease using clinical data, offering a scalable and cost-effective solution for early identification and intervention.

## Contribution

A novel clinical predictive model for NAFLD, built with logistic regression on real-world data, that shows strong calibration and high accuracy.

## Key findings

- The model achieved an area under the ROC curve (AUC) of 0.80 in training and 0.78 in validation, showing strong discrimination.
- Calibration tests showed low mean absolute error (0.016 in training, 0.012 in validation), indicating accurate risk prediction.
- Comprehensive metrics (F1 score: 0.76, precision: 0.71, recall: 0.82) confirmed the model's robustness and clinical utility.
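The reported F1 score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The snippet below is a minimal illustration, not the authors' code (the summary states the analysis was done in R); the function name `f1_from_precision_recall` is hypothetical.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score, i.e. the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values as reported in the key findings.
reported_precision = 0.71
reported_recall = 0.82

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 2))  # 0.76
```

The result rounds to 0.76, which agrees with the F1 score listed above.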

## Abstract

Non-alcoholic fatty liver disease (NAFLD) remains a significant global health challenge, imposing substantial clinical and economic burdens. There is an urgent need to develop reliable predictive tools for early identification and intervention.

This study used data from the Dryad database to develop and validate a clinical NAFLD predictive model, incorporating key parameters from 1,592 subjects randomly split into training and validation groups. We fitted a logistic regression model on the training set, visualized and internally validated it in R, and gauged its net benefit via decision curve analysis. The validation set was then used for external assessment, with performance measured by F1 score, precision, and recall.
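Decision curve analysis, mentioned above, compares a model's net benefit with a "treat everyone" strategy across threshold probabilities. The sketch below uses the standard net-benefit definition (true positives per subject minus false positives per subject, weighted by the odds of the threshold). It is an illustration only: the study's analysis was done in R, the data here are made-up toy values, and the function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on predictions with probability >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every subject as positive."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy labels and predicted risks (assumed values, not study data).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_risks = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

nb_model = net_benefit(toy_labels, toy_risks, threshold=0.5)
nb_all = net_benefit_treat_all(toy_labels, threshold=0.5)
print(nb_model, nb_all)  # 0.25 0.0
```

A decision curve repeats this calculation over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero (treat none).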

The model showed strong discrimination, with an area under the receiver operating characteristic curve (AUC) of 0.80 (95% confidence interval: 0.77–0.82) in training and 0.78 in validation, indicating high accuracy in NAFLD risk prediction. Calibration tests showed close alignment between predicted and actual risks, with mean absolute error values of 0.016 (training) and 0.012 (validation). Comprehensive metrics (F1 score: 0.76, precision: 0.71, recall: 0.82) reinforced its robustness and clinical value.
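For readers less familiar with the discrimination metric, the area under the ROC curve equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. The sketch below computes it that way on made-up toy data. It is not the authors' code, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (case, non-case) pairs ranked correctly."""
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(cases) * len(controls))


# Toy labels and predicted risks (assumed values, not study data).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_risks = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

auc = roc_auc(toy_labels, toy_risks)
print(auc)  # 0.9375
```

Read this way, the reported validation AUC of 0.78 means that the model ranks a subject with NAFLD above a subject without it in roughly 78% of such pairs.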

These results support the successful development of an NAFLD predictive tool with high calibration accuracy and strong performance. Because it relies on readily available clinical data, the model offers a scalable, economical approach to NAFLD risk assessment that could support precise, personalized prevention and control and more efficient resource allocation.

## Linked entities

- **Diseases:** non-alcoholic fatty liver disease (MONDO:0013209), NAFLD (MONDO:0013209)

## Full-text entities

- **Genes:** GPT (glutamic--pyruvic transaminase) [NCBI Gene 2875] {aka AAT1, ALT, ALT1, GPT1, SGPT}, CCL11 (C-C motif chemokine ligand 11) [NCBI Gene 6356] {aka SCYA11}, Aspartate aminotransferase [NCBI Gene 107763872]
- **Diseases:** dyslipidemia (MESH:D050171), metabolic syndrome (MESH:D024821), cirrhosis (MESH:D005355), inflammation (MESH:D007249), tobacco (MESH:D014029), liver fibrosis (MESH:D008103), Diabetes (MESH:D003920), NAFLD (MESH:D065626), NASH (MESH:D005235), obese (MESH:D009765), splenomegaly (MESH:D013163), fatty liver (MESH:D005234), atherosclerosis (MESH:D050197), death (MESH:D003643), hypertension (MESH:D006973), portal hypertension (MESH:D006975), insulin resistance (MESH:D007333), hepatic injury (MESH:D056486), adiposity (MESH:D018205), type 2 diabetes (MESH:D003924), TC (OMIM:275350), liver cancer (MESH:D006528)
- **Chemicals:** TC (MESH:D013667), Triglycerides (MESH:D014280), cholesterol (MESH:D002784), copper (MESH:D003300), TG (MESH:D013866), LDL-C (-), lipid (MESH:D008055), glucose (MESH:D005947), alcohol (MESH:D000438)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12935685/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12935685/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12935685/full.md

---
Source: https://tomesphere.com/paper/PMC12935685