# Graph Learning in Bioinformatics: A Survey of Graph Neural Network Architectures, Biological Graph Construction and Bioinformatics Applications

**Authors:** Lijia Deng, Ziyang Dong, Zhengling Yang, Bo Gong, Le Zhang

PMC · DOI: 10.3390/biom16020333 · Biomolecules · 2026-02-23

## TL;DR

This paper reviews how graph neural networks are constructed, trained and applied in bioinformatics, from building biological graphs to tasks such as drug discovery and disease–gene association prediction.

## Contribution

The paper provides a structured framework for understanding and applying GNNs in bioinformatics, covering graph construction, architectures, and applications.

## Key findings

- GNNs are effective for modeling biological data due to their ability to capture complex dependencies in non-Euclidean structures.
- The paper highlights the importance of biological graph construction and representation in determining GNN performance.
- Applications of GNNs span disease-gene associations, drug discovery, and multi-omics integration.

## Abstract

Graph Neural Networks (GNNs) have become a central methodology for modelling biological systems where entities and their interactions form inherently non-Euclidean structures. From protein interaction networks and gene regulatory circuits to molecular graphs and multi-omics integration, the relational nature of biological data makes GNNs particularly well-suited for capturing complex dependencies that traditional deep learning methods fail to represent. Despite their rapid adoption, the effectiveness of GNNs in bioinformatics depends not only on model design but also on how biological graphs are constructed, parameterised and trained. In this review, we provide a structured framework for understanding and applying GNNs in bioinformatics, organised around three key dimensions: (1) graph construction and representation, including strategies for deriving biological networks from heterogeneous sources and selecting biologically meaningful node and edge features; (2) GNN architectures, covering spectral and spatial formulations, representative models such as Graph Convolutional Networks (GCNs), Graph Attention Networks (GATs), Graph Sample and AggregatE (GraphSAGE) and Graph Isomorphism Network (GIN), and recent advances including transformer-based and self-supervised paradigms; and (3) applications in biomedical domains, spanning disease–gene association prediction, drug discovery, protein structure and function analysis, multi-omics integration and biomedical knowledge graphs. We further examine training considerations, including optimisation techniques, regularisation strategies and challenges posed by data sparsity and noise in biological settings. By synthesising methodological foundations with domain-specific applications, this review clarifies how graph quality, architectural choice and training dynamics jointly influence model performance. 
We also highlight emerging challenges such as modelling temporal biological processes, improving interpretability, and enabling robust multimodal fusion that will shape the next generation of GNNs in computational biology.
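To make the architectural dimension concrete, here is a minimal sketch of one graph-convolution step in the commonly used GCN form, H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W). It is an illustration only, not code from the review. The three-node path graph, the feature and weight values, and the helper names `normalized_adjacency`, `matmul` and `gcn_layer` are all hypothetical.

```python
import math

def normalized_adjacency(num_nodes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph as nested lists."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loop so each node keeps its own features
    for u, v in edges:
        a[u][v] = 1.0
        a[v][u] = 1.0
    deg = [sum(row) for row in a]  # degrees include the self-loop
    return [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]

def matmul(x, y):
    """Plain dense matrix product for nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]

def gcn_layer(a_hat, features, weights):
    """One propagation step: ReLU(a_hat @ features @ weights)."""
    z = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, value) for value in row] for row in z]

# Hypothetical toy graph: a path 0 - 1 - 2 (not real biological interactions).
edges = [(0, 1), (1, 2)]
a_hat = normalized_adjacency(3, edges)

# Hypothetical 2-dimensional node features and a 2x2 weight matrix.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, -1.0], [0.5, 1.0]]

hidden = gcn_layer(a_hat, features, weights)
for node, row in enumerate(hidden):
    print(node, row)
```

In a biological setting the edge list might come from a protein interaction network or a molecular graph, and the weights would be learned rather than fixed. The normalisation step is the same either way, which is one reason graph construction choices feed directly into what the model can aggregate.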

## Full-text entities

- **Genes:** EGFR (epidermal growth factor receptor) [NCBI Gene 1956] {aka ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS}, EREG (epiregulin) [NCBI Gene 2069] {aka EPR, ER, Ep}, TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, LINC02574 (long intergenic non-protein coding RNA 2574) [NCBI Gene 111216282] {aka HEAL}, TTC41P (tetratricopeptide repeat domain 41, pseudogene) [NCBI Gene 253724] {aka GNN, GNNP}, GRN (granulin precursor) [NCBI Gene 2896] {aka CLN11, FTD2, GEP, GP88, PCDGF, PEPI}
- **Diseases:** toxicity (MESH:D064420), infection (MESH:D007239), PROTEIN (MESH:D011488), carcinogenicity (MESH:D011230), LGCL (MESH:D016369), epilepsy (MESH:D004827), SGCN (MESH:D008569), sepsis (MESH:D018805), WSI (MESH:C564543), DDI (MESH:D000081015), depression (MESH:D003866), GNNs (MESH:D015441), Alzheimer's Disease (MESH:D000544), Cancer (MESH:D009369), inflammation (MESH:D007249), injury to (MESH:D014947), neurological disorders (MESH:D009461), PTC (MESH:D000077273), NCI-I (MESH:D006969), AIDS (MESH:D000163), PATCHY (MESH:C531609)
- **Chemicals:** amino acid (MESH:D000596), GCN (-), acid (MESH:D000143), ZINC (MESH:D015032)
- **Species:** Homo sapiens (human, species) [taxon 9606], Salmonella enterica subsp. enterica serovar Typhimurium (no rank) [taxon 90371], Human immunodeficiency virus 1 (no rank) [taxon 11676]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12938586/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12938586/full.md

## References

200 references — full list in the complete paper: https://tomesphere.com/paper/PMC12938586/full.md

---
Source: https://tomesphere.com/paper/PMC12938586