# GNN-FTuckER: A novel link prediction model for identifying suitable populations for tea varieties

**Authors:** Jun Li, Bing Yang, Jiaxin Liu, Xu Wang, Zhongyuan Wu, Qiang Huang, Peng He

PMC · DOI: 10.1371/journal.pone.0323315 · PLOS One · 2025-05-27

## TL;DR

This paper introduces a new model called GNN-FTuckER to predict which tea varieties are suitable for specific populations using graph neural networks and tensor decomposition.

## Contribution

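GNN-FTuckER combines an SE-GNN structural encoder with an improved TuckER decoder, which adds nonlinear activation functions, to predict "tea suitability" links in a tea knowledge graph. The authors also constructed a tea dataset, TeaPle, to support the work.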

## Key findings

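- In comparative experiments, GNN-FTuckER achieved superior performance on the public datasets WN18RR and FB15k-237 and on the TeaPle dataset.
- In ablation studies, H@10 improved by 4.3% on WN18RR and by 1.5% on FB15k-237, with a 1.3% increase in MRR; on TeaPle, H@3 improved by 4.7% and H@10 by 3.1%.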
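
For readers unfamiliar with the H@k and MRR metrics reported here, the following is a minimal sketch of how they are conventionally computed from the rank of the correct entity for each test triple. The function name `rank_metrics` and the example ranks are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol (for example, filtered versus raw ranking) is not shown.

```python
def rank_metrics(ranks, ks=(1, 3, 10)):
    """Compute MRR and Hits@k from 1-based ranks of the correct entity.

    `ranks` holds one rank per test triple (hypothetical input format).
    """
    n = len(ranks)
    # Mean reciprocal rank: average of 1/rank over all test triples.
    mrr = sum(1.0 / r for r in ranks) / n
    # Hits@k: fraction of test triples whose correct entity is ranked in the top k.
    hits = {k: sum(r <= k for r in ranks) / n for k in ks}
    return mrr, hits


# Made-up ranks for four test triples, purely for illustration.
ranks = [1, 2, 5, 12]
mrr, hits = rank_metrics(ranks)
```

Both metrics lie in [0, 1] and higher is better; MRR rewards placing the correct entity near the top, while H@k only checks whether it falls within the first k candidates.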

## Abstract

Current research on tea primarily focuses on foundational studies of phenotypic characteristics, with insufficient exploration of the relationship between tea varieties and suitable populations. To address this issue, this paper proposes a link prediction model based on Graph Neural Networks (GNN) and tensor decomposition, named GNN-FTuckER, designed to predict the “tea suitability” relationships within the tea knowledge graph. This model integrates the SE-GNN structural encoder with an improved TuckER model decoder. The SE-GNN encoder enhances the modeling capability of the global graph structure by explicitly modeling relations, entities, and triples, thereby obtaining embedding vectors through aggregation, updating, and iterative operations. The improved TuckER model enhances the capture of complex semantics between entities and relations by introducing nonlinear activation functions. To support our research, we constructed a tea dataset, TeaPle. In comparative experiments, GNN-FTuckER achieved superior performance on both public datasets (WN18RR, FB15k-237) and the TeaPle dataset. Ablation studies indicate that the model improved H@10 by 4.3% on the WN18RR dataset and by 1.5% on the FB15k-237 dataset, with a 1.3% increase in MRR. In the TeaPle dataset, H@3 improved by 4.7% and H@10 increased by 3.1%. This research provides significant insights for further exploring the potential of tea varieties and evaluating the health benefits of tea consumption.
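
The decoder idea in the abstract can be illustrated with a minimal sketch of TuckER-style scoring, in which a shared core tensor is contracted with the subject, relation, and object embeddings. The abstract states only that the improved decoder introduces nonlinear activation functions; where they are applied is not specified here, so the placement below (a `tanh` after the subject and relation contractions) is an assumption for illustration. The random vectors stand in for embeddings that an encoder such as SE-GNN would produce, and all names are hypothetical.

```python
import numpy as np


def tucker_score(core, e_s, w_r, e_o, activation=None):
    """Score one (subject, relation, object) triple, TuckER-style.

    core: (d_e, d_r, d_e) core tensor
    e_s, e_o: (d_e,) entity embeddings; w_r: (d_r,) relation embedding
    """
    # Contract the core tensor with the subject (mode 1) and relation (mode 2).
    hidden = np.einsum("i,j,ijk->k", e_s, w_r, core)
    # Assumed placement of the nonlinearity; the paper's design may differ.
    if activation is not None:
        hidden = activation(hidden)
    # Contract with the object embedding (mode 3) to obtain a scalar logit.
    return float(hidden @ e_o)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Random stand-ins for learned parameters and encoder outputs.
rng = np.random.default_rng(0)
d_e, d_r = 4, 3
core = rng.normal(size=(d_e, d_r, d_e))
e_s, e_o = rng.normal(size=d_e), rng.normal(size=d_e)
w_r = rng.normal(size=d_r)

linear_logit = tucker_score(core, e_s, w_r, e_o)
nonlinear_logit = tucker_score(core, e_s, w_r, e_o, activation=np.tanh)
probability = sigmoid(nonlinear_logit)  # plausibility of the triple
```

Without the activation, the score is a purely multilinear function of the three embeddings; inserting a nonlinearity is one way to let the decoder capture interactions that a multilinear form cannot.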

## Full-text entities

- **Diseases:** fatigue (MESH:D005221), obesity (MESH:D009765), MetS (MESH:D024821), liver disease (MESH:D008107), constipation (MESH:D003248), dryness (MESH:D014987), acute hepatitis (MESH:D017114)
- **Chemicals:** blood sugar (MESH:D001786), cholesterol (MESH:D002784), water (MESH:D014867), green tea extract (MESH:C045651), alcohol (MESH:D000438)
- **Species:** Camellia sinensis (black tea, species) [taxon 4442], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12112203/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12112203/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/PMC12112203/full.md

---
Source: https://tomesphere.com/paper/PMC12112203