# POSA-GO: Fusion of Hierarchical Gene Ontology and Protein Language Models for Protein Function Prediction

**Authors:** Yubao Liu, Benrui Wang, Bocheng Yan, Haiyue Jiang, Yinfei Dai

PMC · DOI: 10.3390/ijms26136362 · 2025-07-01

## TL;DR

POSA-GO predicts protein function by fusing ESM-2 sequence features with topological embeddings of the Gene Ontology hierarchy through multi-head self-attention.

## Contribution

POSA-GO (Partial Order-Based Self-Attention for Gene Ontology) is a framework that encodes the partial order among hierarchical GO terms and fuses it with protein language model features through self-attention.
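
To make the notion of a partial order over GO terms concrete, here is a minimal Python sketch on a toy hierarchy. The term names, the `PARENTS` table, and the helpers `ancestors`, `precedes`, and `depth` are hypothetical illustrations; this is not the paper's embedding procedure, only the ordering relation such an embedding would need to respect.

```python
# Toy GO-like hierarchy: each term maps to its direct parents (is_a edges).
# Term names are hypothetical placeholders, not real GO identifiers.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "ion_binding": ["binding"],
    "kinase": ["catalysis", "binding"],
}


def ancestors(term, parents):
    """Return every strict ancestor of `term` by walking parent edges."""
    seen = set()
    stack = list(parents[term])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def precedes(a, b, parents):
    """Partial order check: True if a equals b or b is an ancestor of a."""
    return a == b or b in ancestors(a, parents)


def depth(term, parents):
    """Length of the longest path from `term` up to a root term."""
    if not parents[term]:
        return 0
    return 1 + max(depth(p, parents) for p in parents[term])


if __name__ == "__main__":
    print(sorted(ancestors("kinase", PARENTS)))
    print(precedes("ion_binding", "catalysis", PARENTS))
    print(depth("kinase", PARENTS))
```

Two terms such as `ion_binding` and `catalysis` are incomparable under this relation, which is what distinguishes a partial order from a simple ranking by depth.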

## Key findings

- POSA-GO outperforms existing state-of-the-art methods on the CAFA3 and SwissProt datasets in terms of Fmax and AUPR.
- The model uses ESM-2 and topological embeddings of GO terms to improve functional annotation accuracy.
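
Results of this kind are commonly reported with Fmax, the protein-centric maximum F-measure used in CAFA evaluations. The sketch below shows one way to compute it, assuming every protein has at least one true term. The function name `fmax`, the input layout (sets of true terms and per-protein score dictionaries), and the threshold grid are assumptions made for illustration, not the paper's evaluation code.

```python
def fmax(y_true, y_score, thresholds=None):
    """Protein-centric Fmax.

    y_true:  list of non-empty sets of true terms, one per protein.
    y_score: list of dicts mapping term -> predicted score in [0, 1].
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for truth, scores in zip(y_true, y_score):
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & truth)
            # Precision is averaged only over proteins with a prediction at t.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over all proteins.
            recalls.append(hits / len(truth))
        if not precisions:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


if __name__ == "__main__":
    truth = [{"a", "b"}, {"c"}]
    scores = [{"a": 0.9, "b": 0.4, "c": 0.2}, {"c": 0.8, "a": 0.3}]
    print(fmax(truth, scores))
```

Because the score is the maximum over thresholds, it rewards a model whose ranking of terms is good at some operating point, without requiring that point to be fixed in advance.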

## Abstract

Protein function prediction plays a crucial role in uncovering the molecular mechanisms underlying life processes in the post-genomic era. However, with the widespread adoption of high-throughput sequencing technologies, the pace of protein function annotation lags significantly behind that of sequence discovery, highlighting the urgent need for more efficient and reliable predictive methods. Existing methods often ignore the hierarchical structure of Gene Ontology (GO) terms and struggle to dynamically associate protein features with functional contexts. To address these limitations, we propose a novel protein function prediction framework, termed Partial Order-Based Self-Attention for Gene Ontology (POSA-GO), a cross-modal collaborative modelling approach that fuses GO terms with protein sequences. The model leverages the pre-trained language model ESM-2 to extract deep semantic features from protein sequences. Meanwhile, it transforms the partial order relationships among GO terms into topological embeddings to capture their biological hierarchical dependencies. Furthermore, a multi-head self-attention mechanism is employed to dynamically model the association weights between proteins and GO terms, thereby enabling context-aware functional annotation. Comparative experiments on the CAFA3 and SwissProt datasets demonstrate that POSA-GO outperforms existing state-of-the-art methods in terms of Fmax and AUPR metrics, offering a promising solution for protein functional studies.
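
The association weights between a protein and GO terms can be illustrated with a single-head scaled dot-product attention, a deliberate simplification of the multi-head mechanism described above. In this sketch the protein vector stands in for a pooled ESM-2 embedding and the GO term vectors stand in for topological embeddings; all values, names, and dimensions are hypothetical, and no learned projection matrices are included.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector.

    Returns the attention weights over the keys and the weighted sum
    of the value vectors.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys
    ]
    weights = softmax(scores)
    context = [
        sum(w * v[j] for w, v in zip(weights, values))
        for j in range(len(values[0]))
    ]
    return weights, context


# Hypothetical 4-dimensional embeddings, chosen by hand for illustration.
protein = [1.0, 0.0, 1.0, 0.0]
go_terms = {
    "term_a": [1.0, 0.0, 1.0, 0.0],
    "term_b": [0.0, 1.0, 0.0, 1.0],
    "term_c": [1.0, 1.0, 0.0, 0.0],
}

if __name__ == "__main__":
    names = list(go_terms)
    vectors = [go_terms[n] for n in names]
    weights, context = attend(protein, vectors, vectors)
    for name, w in zip(names, weights):
        print(name, round(w, 3))
```

The term whose embedding aligns most closely with the protein vector receives the largest weight; a multi-head variant would repeat this with several learned projections and concatenate the resulting context vectors.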

## Full-text entities

- **Genes:** RYR1 (ryanodine receptor 1) [NCBI Gene 6261] {aka CCO, CMYO1A, CMYO1B, CMYP1A, CMYP1B, KDS}
- **Diseases:** injury to (MESH:D014947)
- **Chemicals:** amino acid (MESH:D000596)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12250456/full.md

---
Source: https://tomesphere.com/paper/PMC12250456