# Identifying Metabolite–Disease Associations via Messaging in Hypergraphs

**Authors:** Fuheng Xiao, Yihao Ran, Zhanchao Li

PMC · DOI: 10.3390/metabo16020116 · Metabolites · 2026-02-09

## TL;DR

This paper introduces DHG-LGB, a machine learning framework that predicts metabolite–disease associations by modeling each disease as a hyperedge connecting metabolites, proteins, and GO annotations, then classifying hypergraph-derived embeddings with LightGBM.

## Contribution

DHG-LGB represents diseases as hyperedges over metabolites, proteins, and GO annotations, capturing multi-way biological interactions that pairwise graphs cannot, and outperforms existing methods in predicting metabolite–disease associations.

## Key findings

- DHG-LGB achieved 98.87% accuracy and 0.9983 AUC in predicting metabolite–disease associations under 5-fold cross-validation.
- The framework maintained strong performance across positive-to-negative ratios from 1:1 through 1:10, with AUC above 0.9954 and AUPRC above 0.9820.
- Comparative evaluations showed that DHG-LGB outperformed existing methods.
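
The accuracy and AUC figures above, along with the sensitivity, specificity, precision, and MCC reported in the abstract, are standard binary-classification metrics. The sketch below shows how such metrics are conventionally computed from confusion-matrix counts and from classifier scores. It is not the authors' evaluation code; the function names `binary_metrics` and `roc_auc` and all counts and scores are hypothetical illustrations.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard threshold-based metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    # MCC denominator: geometric mean of the four marginal totals
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # recall on known associations
        "specificity": tn / (tn + fp),   # recall on non-associations
        "precision": tp / (tp + fp),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win. This pairwise form is O(n * m), which is
    fine for illustration but slow for large evaluation sets.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical counts and scores, chosen only to exercise the functions.
metrics = binary_metrics(tp=90, fp=20, tn=980, fn=10)
auc = roc_auc([0.9, 0.8], [0.1, 0.85])
```

Because accuracy is dominated by the majority class when negatives outnumber positives, MCC and AUPRC are the more informative figures at skewed ratios such as 1:10.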

## Abstract

Background: Traditional machine-learning approaches face challenges when attempting to integrate diverse biological information for predicting metabolite–disease relationships. The intricate connections linking metabolites, diseases, proteins, and Gene Ontology (GO) annotations present substantial obstacles for conventional pairwise graph representations, which prove inadequate for modeling such complex multi-way interactions.

Methods: A hypergraph-based framework (DHG-LGB) was developed to exploit this complexity by conceptualizing diseases as hyperedges. Within this architecture, individual hyperedges link multiple vertices including metabolites, proteins, and GO annotations, thereby enabling richer representation of the biological networks underlying metabolite–disease relationships. Metabolite–disease relationships were encoded as low-dimensional vectors through hypergraph neural network (HGNN) operations incorporating Laplacian smoothing and message propagation mechanisms. LightGBM (LGB) was used to construct a model for identifying potential metabolite–disease associations.

Results: Under 5-fold cross-validation, DHG-LGB achieved 98.87% accuracy, 91.77% sensitivity, 99.58% specificity, 95.60% precision, a Matthews correlation coefficient (MCC) of 0.9305, an area under the receiver operating characteristic curve (AUC) of 0.9983, and an area under the precision-recall curve (AUPRC) of 0.9860. The framework maintained strong performance when tested with varying positive-to-negative ratios (spanning 1:1 through 1:10), consistently achieving AUC values exceeding 0.9954 and AUPRC values above 0.9820, indicating robustness to class imbalance. Comparative evaluations against existing methods showed that DHG-LGB performed better.

Conclusions: The DHG-LGB framework delivers more comprehensive modeling of biological interactions relative to conventional approaches and substantially enhances predictive accuracy for metabolite–disease relationships. It is expected to be a valuable computational tool for biomarker identification and precision medicine initiatives.
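
To make the hypergraph idea in the Methods concrete, here is a minimal sketch of diseases as hyperedges and one step of Laplacian-style smoothing. It assumes the propagation rule commonly used in HGNNs, X' = Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X, and omits learnable weights and the nonlinearity. This summary does not give the paper's exact layer, so the rule, the function names, and the toy vertices and diseases are all assumptions for illustration.

```python
import math

# Hypothetical toy data: each disease (hyperedge) lists the vertices it links.
vertices = ["metabolite_A", "metabolite_B", "protein_P", "GO_term_G"]
diseases = {
    "disease_1": ["metabolite_A", "protein_P", "GO_term_G"],
    "disease_2": ["metabolite_B", "protein_P"],
}

def incidence_matrix(vertices, hyperedges):
    """Build H (n_vertices x n_edges): H[v][e] = 1 if vertex v is in hyperedge e."""
    index = {v: i for i, v in enumerate(vertices)}
    H = [[0] * len(hyperedges) for _ in vertices]
    for e, members in enumerate(hyperedges.values()):
        for v in members:
            H[index[v]][e] = 1
    return H

def hgnn_smooth(H, X, edge_weights=None):
    """One parameter-free smoothing step over the hypergraph (assumed rule)."""
    n_v, n_e, d = len(H), len(H[0]), len(X[0])
    w = edge_weights or [1.0] * n_e
    # Vertex degree: weighted number of hyperedges containing the vertex.
    dv = [sum(w[e] * H[v][e] for e in range(n_e)) for v in range(n_v)]
    # Edge degree: number of vertices inside the hyperedge.
    de = [sum(H[v][e] for v in range(n_v)) for e in range(n_e)]
    inv_sqrt_dv = [1.0 / math.sqrt(x) if x > 0 else 0.0 for x in dv]

    # Step 1: vertices send degree-normalized features to their hyperedges.
    edge_msg = [[0.0] * d for _ in range(n_e)]
    for e in range(n_e):
        if de[e] == 0:
            continue
        for k in range(d):
            total = sum(H[v][e] * inv_sqrt_dv[v] * X[v][k] for v in range(n_v))
            edge_msg[e][k] = w[e] * total / de[e]

    # Step 2: hyperedges send the averaged messages back to their vertices.
    return [
        [inv_sqrt_dv[v] * sum(H[v][e] * edge_msg[e][k] for e in range(n_e))
         for k in range(d)]
        for v in range(n_v)
    ]

H = incidence_matrix(vertices, diseases)
# One-hot input features; each smoothed row mixes in hyperedge neighbors.
X = [[1.0 if i == j else 0.0 for j in range(len(vertices))]
     for i in range(len(vertices))]
smoothed = hgnn_smooth(H, X)
```

In a pipeline like the one described, vectors produced this way would then be passed to a gradient-boosting classifier such as LightGBM; that step is left out here to keep the sketch dependency-free.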

## Full-text entities

- **Genes:** FOS (Fos proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 2353] {aka AP-1, C-FOS, p55}, AGRP (agouti related neuropeptide) [NCBI Gene 181] {aka AGRT, ART, ASIP2}, IL1B (interleukin 1 beta) [NCBI Gene 3553] {aka IL-1, IL1-BETA, IL1F2, IL1beta}, PPARGC1A (PPARG coactivator 1 alpha) [NCBI Gene 10891] {aka LEM6, PGC-1(alpha), PGC-1alpha, PGC-1v, PGC1, PGC1A}, ADORA1 (adenosine A1 receptor) [NCBI Gene 134] {aka RDC7}, IRS1 (insulin receptor substrate 1) [NCBI Gene 3667] {aka HIRS-1}, PCK1 (phosphoenolpyruvate carboxykinase 1) [NCBI Gene 5105] {aka PCKDC, PEPCK-C, PEPCK1, PEPCKC}, SUCNR1 (succinate receptor 1) [NCBI Gene 56670] {aka GPR91}, HIF1A (hypoxia inducible factor 1 subunit alpha) [NCBI Gene 3091] {aka HIF-1-alpha, HIF-1A, HIF-1alpha, HIF1, HIF1-ALPHA, MOP1}
- **Diseases:** liver disease (MESH:D008107), inflammation (MESH:D007249), injury to (MESH:D014947), Disease (MESH:D004194), neurodegenerative (MESH:D019636), Parkinson's disease (MESH:D010300), neurotoxic (MESH:D020258), Alzheimer's (MESH:D000544), neurological and psychiatric conditions (MESH:D001523), NMDA receptor hypofunction (MESH:D060426), cancer (MESH:D009369), diabetes mellitus (MESH:D003920), Schizophrenia (MESH:D012559), impaired glucose metabolism (MESH:D044882), Obesity (MESH:D009765), autoimmune diseases (MESH:D001327), metabolic disorders (MESH:D008659), metastasis (MESH:D009362), Crohn's (MESH:D003424), psychosis (MESH:D011618), insulin resistance (MESH:D007333), infectious disease (MESH:D003141), memory deficits (MESH:D008569)
- **Chemicals:** Succinic acid (MESH:D019802), L-Proline (MESH:D011392), Hypoxanthine (MESH:D019271), vanillic acid (MESH:D014641), omega-3 fatty acid (MESH:D015525), HVA (MESH:D006719), DHEAS (MESH:D019314), Fumaric acid (MESH:C032005), L-Alanine (MESH:D000409), tricarboxylic acid (MESH:D014233), L-Leucine (MESH:D007930), Branched-chain amino acids (MESH:D000597), GABA (MESH:D005680), Adenine (MESH:D000225), Citrulline (MESH:D002956), malate (MESH:C030298), 8-OHdG (MESH:D000080242), ketone body (MESH:D007657), Ethanol (MESH:D000431), Sarcosine (MESH:D012521), Curcumin (MESH:D003474), L-Aspartic acid (MESH:D001224), fatty acid (MESH:D005227), Taurine (MESH:D013654), glutathione (MESH:D005978), lipid (MESH:D008055), DHEA (MESH:D003687), DHA (MESH:D004281), fumarate (MESH:D005650), glucose (MESH:D005947), short-chain fatty acid (MESH:D005232)
- **Species:** Bacteroides (genus) [taxon 816], Faecalibacterium prausnitzii (species) [taxon 853], Homo sapiens (human, species) [taxon 9606], Curcuma longa (turmeric, species) [taxon 136217]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12942751/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12942751/full.md

## References

58 references — full list in the complete paper: https://tomesphere.com/paper/PMC12942751/full.md

---
Source: https://tomesphere.com/paper/PMC12942751