GOProteinGNN: Leveraging Protein Knowledge Graphs for Protein Representation Learning
Dan Kalifa, Uriel Singer, Kira Radinsky

TL;DR
GOProteinGNN is a novel graph-based model that integrates protein knowledge graphs with amino acid sequences to produce enriched, context-aware protein representations, improving performance on various biological tasks.
Contribution
The paper introduces a new architecture that combines protein knowledge graphs with protein language models and learns over the entire graph during training, yielding more comprehensive protein representations.
Findings
Outperforms previous protein representation methods on multiple tasks
Effectively captures complex protein relationships and functional annotations
Establishes state-of-the-art results in protein representation learning
Abstract
Proteins play a vital role in biological processes and are indispensable for living organisms. Accurate representation of proteins is crucial, especially in drug development. Recently, there has been a notable increase in interest in utilizing machine learning and deep learning techniques for unsupervised learning of protein representations. However, these approaches often focus solely on the amino acid sequence of proteins and lack factual knowledge about proteins and their interactions, thus limiting their performance. In this study, we present GOProteinGNN, a novel architecture that enhances protein language models by integrating protein knowledge graph information during the creation of amino acid level representations. Our approach allows for the integration of information at both the individual amino acid level and the entire protein level, enabling a comprehensive and effective…
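The abstract describes knowledge-graph information being injected at both the amino acid level and the whole-protein level. The following is a minimal sketch of that general idea only, not the authors' implementation: the mean pooling, the single averaging step over graph neighbors, the additive broadcast back to residues, and every name in it (`inject_graph_knowledge`, `term_a`, `P1`, and so on) are assumptions made for illustration.

```python
from typing import Dict, List

Vector = List[float]


def mean_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def inject_graph_knowledge(
    residue_embeddings: List[Vector],
    protein_id: str,
    node_embeddings: Dict[str, Vector],
    neighbors: Dict[str, List[str]],
) -> List[Vector]:
    """One illustrative round of knowledge injection (hypothetical scheme).

    1. Pool residues into a protein-level vector (mean pooling is an assumption).
    2. Average that vector with the embeddings of the protein's graph neighbors.
    3. Add the graph-aware protein vector back onto every residue embedding.
    """
    protein_vec = mean_vectors(residue_embeddings)
    neighbor_vecs = [node_embeddings[n] for n in neighbors.get(protein_id, [])]
    graph_vec = mean_vectors([protein_vec] + neighbor_vecs)
    return [[r + g for r, g in zip(res, graph_vec)] for res in residue_embeddings]


# Toy example: a 3-residue protein with 2-dimensional embeddings,
# linked to two made-up annotation nodes in a tiny knowledge graph.
residues = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
nodes = {"term_a": [2.0, 2.0], "term_b": [0.0, 4.0]}
edges = {"P1": ["term_a", "term_b"]}
enriched = inject_graph_knowledge(residues, "P1", nodes, edges)
```

In this toy setup every residue embedding is shifted by the same graph-aware vector, so sequence-level detail is preserved while each position also carries protein-level context from the graph. A real model would use learned transformations rather than plain averages.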
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Machine Learning in Bioinformatics · Bioinformatics and Genomic Networks
