# Serum biomarker screening and metabolic profiling analysis of nonalcoholic fatty liver disease patients using untargeted metabolomics and machine learning techniques

**Authors:** Lijuan Dan, Huan Shi, Lina Cao, Xiuyan Li, Lirui Kong, Xiaojie You, Wenping Liu, Yanwei Hao, Dong Wang, Hongfei Song, Jie Mu, Qiao Li

PMC · DOI: 10.3389/fmolb.2026.1730023 · Frontiers in Molecular Biosciences · 2026-02-09

## TL;DR

This study uses metabolomics and machine learning to identify serum biomarkers for early detection of nonalcoholic fatty liver disease (NAFLD), revealing key metabolic pathways and potential diagnostic markers.

## Contribution

The study identifies novel serum metabolites and metabolic pathways associated with NAFLD using untargeted metabolomics and machine learning, offering potential diagnostic biomarkers.

## Key findings

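- 942 significantly differential metabolites (656 upregulated, 286 downregulated) were identified between NAFLD patients and healthy controls.
- Maresin 1, canavaninosuccinate, paraxanthine, and 1-methyluric acid showed robust diagnostic performance for NAFLD, with 1-methyluric acid contributing most.
- The differential metabolites were primarily enriched in caffeine metabolism, cholesterol metabolism, and the FoxO and AMPK signaling pathways.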

## Abstract

Non-alcoholic fatty liver disease (NAFLD) represents the most prevalent chronic hepatic metabolic disorder globally. Without timely intervention, it can progress to non-alcoholic steatohepatitis (NASH), liver fibrosis, and even hepatocellular carcinoma. Early detection and diagnosis are critical for disease management. Metabolomics, a powerful tool for identifying diagnostic metabolic biomarkers of diseases, is frequently integrated with machine learning (ML) algorithms to improve analytical efficiency. This study aims to compare serum metabolomic profiles between NAFLD patients and healthy controls, identify differential metabolites, and employ machine learning algorithms to discover biomarkers with diagnostic value.

This study enrolled 26 healthy controls and 165 patients diagnosed with NAFLD via ultrasound, and performed untargeted metabolomics analysis of their serum. Orthogonal partial least squares-discriminant analysis (OPLS-DA) was applied to screen for significantly differential metabolites between groups, followed by pathway enrichment analysis. In the ML phase, the dataset was split at an 8:2 ratio: 80% of the data (131 NAFLD cases and 21 healthy controls) was used for model training, and 20% (34 NAFLD cases and 5 healthy controls) served as an independent test set to validate model performance.
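The reported training and test counts are what a class-stratified 8:2 split of 165 NAFLD cases and 26 controls would produce. The abstract does not state that the split was stratified, so that is an assumption here. The sketch below uses only the Python standard library; the function name `stratified_split`, the seed, and the largest-remainder rounding rule are illustrative choices, not the authors' implementation.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test while preserving class proportions.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    n = len(labels)
    n_test = math.ceil(test_fraction * n)  # 39 of 191 samples

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    # Ideal (fractional) number of test samples per class.
    quotas = {y: len(idx) * n_test / n for y, idx in by_class.items()}
    alloc = {y: math.floor(q) for y, q in quotas.items()}

    # Hand out the remaining slots to the classes with the largest remainders.
    leftover = n_test - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda y: quotas[y] - alloc[y], reverse=True)
    for y in by_remainder[:leftover]:
        alloc[y] += 1

    train_idx, test_idx = [], []
    for y, idx in by_class.items():
        shuffled = idx[:]
        rng.shuffle(shuffled)
        test_idx.extend(shuffled[:alloc[y]])
        train_idx.extend(shuffled[alloc[y]:])
    return sorted(train_idx), sorted(test_idx)


# Cohort sizes from the abstract: 165 NAFLD patients, 26 healthy controls.
labels = ["NAFLD"] * 165 + ["control"] * 26
train_idx, test_idx = stratified_split(labels)
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print("train:", dict(train_counts))
print("test: ", dict(test_counts))
```

With only 5 controls in the test set, each misclassified control shifts test-set specificity by 20 percentage points, which is worth keeping in mind when reading performance figures from a split of this size.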

Metabolomic differential analysis identified 942 significantly differential metabolites (656 upregulated and 286 downregulated) between the NAFLD and healthy control groups, which were primarily enriched in caffeine metabolism, cholesterol metabolism, and the FoxO and AMPK signaling pathways. After training and validating machine learning models, serum metabolites maresin 1, canavaninosuccinate, paraxanthine, and 1-methyluric acid demonstrated robust diagnostic performance for NAFLD and can serve as independent predictive biomarkers, with 1-methyluric acid exhibiting the highest diagnostic contribution.
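The abstract does not name the metric behind "robust diagnostic performance". One common way to quantify how well a single metabolite separates patients from controls is the area under the ROC curve (AUC), which equals the probability that a randomly chosen patient has a higher value than a randomly chosen control. The sketch below assumes that metric; the function name `roc_auc` and the intensity values are made up for illustration and are not data from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Hypothetical helper for illustration.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up relative intensities of one metabolite (NOT values from the paper).
nafld_values = [2.1, 3.4, 1.8, 2.9]
control_values = [1.2, 1.9, 1.5]

auc = roc_auc(nafld_values, control_values)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 means the metabolite carries no discriminating signal, and 1.0 means perfect separation. For a metabolite that is lower in patients, the two arguments can be swapped.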

Integration of untargeted metabolomics and machine learning effectively distinguishes NAFLD patients from healthy controls. Cholesterol metabolism, caffeine metabolism, and the FoxO and AMPK signaling pathways may participate in NAFLD pathogenesis. ML-validated metabolites 1-methyluric acid, paraxanthine, canavaninosuccinate, and maresin 1 hold potential as diagnostic biomarkers and therapeutic targets for NAFLD, with 1-methyluric acid exhibiting the highest diagnostic relevance. In summary, serum metabolomics provides stable, accurate biomarkers for NAFLD early warning and diagnosis, and this study offers data and resource support for optimizing its clinical.

## Linked entities

- **Chemicals:** maresin 1 (PubChem CID 60201795), canavaninosuccinate (PubChem CID 25201068), paraxanthine (PubChem CID 4687), 1-methyluric acid (PubChem CID 69726)
- **Diseases:** nonalcoholic fatty liver disease (MONDO:0013209), non-alcoholic steatohepatitis (MONDO:0007027), hepatocellular carcinoma (MONDO:0007256)

## Full-text entities

- **Genes:** MAPK8 (mitogen-activated protein kinase 8) [NCBI Gene 5599] {aka JNK, JNK-46, JNK1, JNK1A2, JNK21B1/2, PRKM8}, APOA1 (apolipoprotein A1) [NCBI Gene 335] {aka AMYLD3, HPALP2, apo(a)}, DUSP9 (dual specificity phosphatase 9) [NCBI Gene 1852] {aka MKP-4, MKP4}, MTOR (mechanistic target of rapamycin kinase) [NCBI Gene 2475] {aka FRAP, FRAP1, FRAP2, RAFT1, RAPT1, SKS}, SLC17A5 (solute carrier family 17 member 5) [NCBI Gene 26503] {aka AST, ISSD, NSD, SD, SIALIN, SIASD}, FOXO1 (forkhead box O1) [NCBI Gene 2308] {aka FKH1, FKHR, FOXO1A}, PRKAA1 (protein kinase AMP-activated catalytic subunit alpha 1) [NCBI Gene 5562] {aka AMPK, AMPK alpha 1, AMPKa1}, CYP2E1 (cytochrome P450 family 2 subfamily E member 1) [NCBI Gene 1571] {aka CPE1, CYP2E, P450-J, P450C2E}, PIK3CB (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit beta) [NCBI Gene 5291] {aka P110BETA, PI3K, PI3KBETA, PIK3C1}, MAP3K5 (mitogen-activated protein kinase kinase kinase 5) [NCBI Gene 4217] {aka ASK1, MAPKKK5, MEKK5}, SHROOM4 (shroom family member 4) [NCBI Gene 57477] {aka MRXSSDS, SHAP, shrm4}, GGT1 (gamma-glutamyltransferase 1) [NCBI Gene 2678] {aka CD224, D22S672, D22S732, GGT, GGT 1, GGTD}, MAPK14 (mitogen-activated protein kinase 14) [NCBI Gene 1432] {aka CSBP, CSBP1, CSBP2, CSPB1, EXIP, Mxi2}, AKT1 (AKT serine/threonine kinase 1) [NCBI Gene 207] {aka AKT, PKB, PKB-ALPHA, PRKBA, RAC, RAC-ALPHA}, CASP6 (caspase 6) [NCBI Gene 839] {aka CSP-6, MCH2, caspase-6}, GGTLC5P (gamma-glutamyltransferase light chain 5 pseudogene) [NCBI Gene 653590] {aka GGT}, ABCA1 (ATP binding cassette subfamily A member 1) [NCBI Gene 19] {aka ABC-1, ABC1, CERP, HDLCQTL13, HDLDT1, HPALP1}, PRKAB1 (protein kinase AMP-activated non-catalytic subunit beta 1) [NCBI Gene 5564] {aka AMPK, HAMPKb}, VIP (vasoactive intestinal peptide) [NCBI Gene 7432] {aka PHM27}, GPT (glutamic--pyruvic transaminase) [NCBI Gene 2875] {aka AAT1, ALT, ALT1, GPT1, SGPT}, SIRT1 (sirtuin 1) [NCBI Gene 23411] {aka SIR2, SIR2L1, SIR2alpha}
- **Diseases:** death (MESH:D003643), viral hepatitis (MESH:D014777), chronic viral hepatitis (MESH:D006525), insulin resistance (MESH:D007333), inflammatory bowel disease (MESH:D015212), Hepatitis (MESH:D056486), liver failure (MESH:D017093), cholesterol (MESH:C535937), hepatocellular carcinoma (MESH:D006528), hepatic lipid (MESH:D011017), necrosis (MESH:D009336), autoimmune hepatitis (MESH:D019693), chronic (MESH:D002908), hepatic metabolic disorder (MESH:D008107), hepatic inflammation (MESH:D007249), metabolic syndrome (MESH:D024821), cirrhosis (MESH:D005355), cirrhotic (MESH:D000094724), NAFLD (MESH:D065626), malignancies (MESH:D009369), diabetes (MESH:D003920), liver fibrosis (MESH:D008103), MASH (MESH:D005234), NASH (MESH:D005235), obesity (MESH:D009765), metabolic diseases (MESH:D008659), Cushing's syndrome (MESH:D003480), Wilson's disease (MESH:D006527), hepatocyte damage (MESH:D020263), celiac disease (MESH:D002446)
- **Chemicals:** 2-chloro-L-phenylalanine (-), bile acid (MESH:D001647), sulfur (MESH:D013455), Caffeine (MESH:D002110), lignans (MESH:D017705), amino acids (MESH:D000596), vitamin D3 (MESH:D002762), carbohydrates (MESH:D002241), fatty acid (MESH:D005227), steroid hormone (MESH:D013256), lipid (MESH:D008055), nucleosides (MESH:D009705), alcohol (MESH:D000438), glucose (MESH:D005947), methanol (MESH:D000432), ammonium formate (MESH:C030544), 1-methyluric acid (MESH:C030530), nitrogen (MESH:D009584), triglyceride (MESH:D014280), acetonitrile (MESH:C032159), PF-06409577 (MESH:C000617640), tricarboxylic acid (MESH:D014233), lipid peroxide (MESH:D008054), benzene (MESH:D001554), water (MESH:D014867), free fatty acid (MESH:D005230), Paraxanthine (MESH:C021183), alkaloids (MESH:D000470), Cholesterol (MESH:D002784), ethanol (MESH:D000431)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]
- **Cell lines:** HepG2 — Homo sapiens (Human), Hepatoblastoma, Cancer cell line (CVCL_0027)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12926098/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12926098/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/PMC12926098/full.md

---
Source: https://tomesphere.com/paper/PMC12926098