DeepGATGO: A Hierarchical Pretraining-Based Graph-Attention Model for Automatic Protein Function Prediction

Zihao Li, Changkun Jiang, and Jianqiang Li

arXiv:2307.13004·q-bio.QM·July 26, 2023·5 cites

Open Access

TL;DR

DeepGATGO is a hierarchical, sequence-based graph attention model that uses pre-trained embeddings and contrastive learning to improve the accuracy and scalability of large-scale protein function prediction.

Contribution

The paper introduces a hierarchical approach that combines graph attention networks (GATs) and contrastive learning with pre-trained embeddings to predict protein function from sequences alone.

Findings

01. Outperforms existing methods in GO term enrichment analysis

02. Demonstrates better scalability on large datasets

03. Effectively captures intrinsic data features using GATs and contrastive learning

Abstract

Automatic protein function prediction (AFP) is classified as a large-scale multi-label classification problem aimed at automating protein enrichment analysis to eliminate the current reliance on labor-intensive wet-lab methods. Currently, popular methods primarily combine protein-related information and Gene Ontology (GO) terms to generate final functional predictions. For example, protein sequences, structural information, and protein-protein interaction networks are integrated as prior knowledge to fuse with GO term embeddings and generate the ultimate prediction results. However, these methods are limited by the difficulty in obtaining structural information or network topology information, as well as the accuracy of such data. Therefore, more and more methods that only use protein sequences for protein function prediction have been proposed, which is a more reliable and…
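The multi-label framing in the abstract can be illustrated with a minimal sketch: each GO term receives its own independent probability, so one protein may be assigned several terms at once. This is not the DeepGATGO architecture. The function names, the logit values, the 0.5 threshold, and the use of binary cross-entropy as the training loss are all assumptions made for illustration; the GO identifiers are real terms chosen only as examples.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_go_terms(logits: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return every GO term whose independent probability meets the threshold.

    Unlike single-label classification, there is no softmax across terms:
    each term is decided on its own, so zero, one, or many terms may be returned.
    """
    return sorted(term for term, z in logits.items() if sigmoid(z) >= threshold)


def binary_cross_entropy(logits: dict[str, float], labels: dict[str, int]) -> float:
    """Mean per-term binary cross-entropy (an assumed, commonly used multi-label loss)."""
    total = 0.0
    for term, z in logits.items():
        p = sigmoid(z)
        y = labels[term]
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(logits)


# Hypothetical model outputs for one protein (values invented for illustration).
example_logits = {"GO:0003677": 2.1, "GO:0005524": 0.3, "GO:0016020": -1.7}
# Hypothetical ground-truth annotations for the same protein.
example_labels = {"GO:0003677": 1, "GO:0005524": 1, "GO:0016020": 0}

predicted = predict_go_terms(example_logits)
loss = binary_cross_entropy(example_logits, example_labels)
print(predicted, round(loss, 4))
```

In a real AFP setting the label space covers thousands of GO terms, which is what makes the problem "large-scale"; the per-term decision rule stays the same.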

Taxonomy

Topics: Bioinformatics and Genomic Networks · Machine Learning in Bioinformatics · Computational Drug Discovery Methods

Methods: Contrastive Learning · Ontology