ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space
Wajdi Dhifli, Abdoulaye Baniré Diallo

TL;DR
ProtNN is a scalable method for protein function prediction that embeds protein graphs as vectors and classifies them by nearest neighbor search. It is reported to classify accurately while running thousands of times faster than state-of-the-art approaches.
Contribution
This paper introduces ProtNN, a graph embedding-based approach for protein function prediction that the authors report to be substantially faster and more scalable than prior methods.
Findings
ProtNN accurately classifies protein functions across multiple datasets.
ProtNN runs thousands of times faster than state-of-the-art methods.
ProtNN scales efficiently to large datasets like the entire PDB.
Abstract
Studying the function of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Still, determining the function of a protein structure remains a difficult, costly, and time-consuming task. The difficulties are often due to the essential role of spatial and topological structures in the determination of protein functions in living cells. In this paper, we propose ProtNN, a novel approach for protein function prediction. Given an unannotated protein structure and a set of annotated proteins, ProtNN finds the nearest neighbor annotated structures based on protein-graph pairwise similarities. Given a query protein, ProtNN finds the nearest neighbor reference proteins based on a graph representation model and a pairwise similarity between vector embedding of both query and…
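The nearest neighbor idea described in the abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the five topological attributes in `embed`, the min-max scaling, the Euclidean distance, the majority vote over `k` neighbors, and the toy path and complete graphs standing in for protein graphs. It only shows the general pattern of embedding each graph as a vector and labeling a query by its closest annotated references.

```python
import math
from collections import Counter


def embed(graph):
    """Map an undirected graph {node: set(neighbors)} to a vector of simple
    topological attributes (an illustrative choice, not the paper's set)."""
    n = len(graph)
    degrees = [len(nbrs) for nbrs in graph.values()]
    m = sum(degrees) / 2  # each undirected edge is counted twice
    avg_degree = sum(degrees) / n if n else 0.0
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    max_degree = max(degrees) if degrees else 0
    return [float(n), float(m), avg_degree, density, float(max_degree)]


def min_max_scaler(vectors):
    """Return a function that rescales each attribute to [0, 1] using the
    ranges observed in the reference vectors."""
    lows = [min(col) for col in zip(*vectors)]
    highs = [max(col) for col in zip(*vectors)]

    def scale(v):
        return [(x - lo) / (hi - lo) if hi > lo else 0.0
                for x, lo, hi in zip(v, lows, highs)]

    return scale


def predict(query_graph, references, k=3):
    """Label the query by majority vote among its k nearest references.
    `references` is a list of (graph, label) pairs."""
    ref_vectors = [embed(g) for g, _ in references]
    scale = min_max_scaler(ref_vectors)
    q = scale(embed(query_graph))
    ranked = sorted(
        (math.dist(q, scale(v)), label)
        for v, (_, label) in zip(ref_vectors, references)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for protein graphs (hypothetical data, not real structures).
def path_graph(n):
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def complete_graph(n):
    return {i: set(range(n)) - {i} for i in range(n)}


references = (
    [(path_graph(n), "chain-like") for n in (4, 5, 6)]
    + [(complete_graph(n), "compact") for n in (4, 5, 6)]
)

print(predict(complete_graph(5), references))
print(predict(path_graph(5), references))
```

In this sketch each reference graph is embedded once, so a query costs one embedding plus a linear scan over fixed-length vectors rather than a pairwise graph comparison. That is one plausible reason an embedding-based approach can scale to large reference sets, though the sketch makes no claim about how ProtNN itself is implemented.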
Taxonomy
Topics: Machine Learning in Bioinformatics · Bioinformatics and Genomic Networks · Computational Drug Discovery Methods
