GraphPI: Efficient Protein Inference with Graph Neural Networks
Zheng Ma, Jiazhen Chen, Lei Xin, and Ali Ghodsi

TL;DR
GraphPI is a graph neural network framework for protein inference. It trains on unlabeled data using pseudo-labels, applies to new datasets without dataset-specific training, and runs faster than existing methods.
Contribution
The paper presents a GNN-based approach to protein inference that works without dataset-specific training and uses self-training with pseudo-labels to overcome label scarcity.
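The self-training idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the classifier is a toy stand-in (the midpoint between the two class means of a single score, where a higher score is assumed to mean "protein present"), not GraphPI's GNN, and the names `fit_midpoint`, `self_train`, `confidence`, and `steepness` are hypothetical. It shows only the loop itself: fit on the current pseudo-labels, then overwrite the labels the model is confident about, and repeat.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_midpoint(xs, labels):
    """Toy stand-in for a trained model: midpoint between the class means.

    Returns None when one class is empty, since no boundary can be fit.
    """
    pos = [x for x, y in zip(xs, labels) if y == 1]
    neg = [x for x, y in zip(xs, labels) if y == 0]
    if not pos or not neg:
        return None
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def self_train(xs, pseudo_labels, rounds=3, confidence=0.9, steepness=20.0):
    """Iteratively refine pseudo-labels using the model's confident predictions."""
    labels = list(pseudo_labels)  # start from the base algorithm's labels
    midpoint = None
    for _ in range(rounds):
        fitted = fit_midpoint(xs, labels)
        if fitted is None:
            break
        midpoint = fitted
        for i, x in enumerate(xs):
            p = sigmoid(steepness * (x - midpoint))  # P(protein present)
            if p >= confidence:
                labels[i] = 1
            elif p <= 1.0 - confidence:
                labels[i] = 0
            # otherwise the prediction is not confident: keep the old label
    return labels, midpoint


# Hypothetical scores in [0, 1]; the third pseudo-label (score 0.2) is
# deliberately wrong to mimic an error made by the base algorithm.
scores = [0.05, 0.1, 0.2, 0.8, 0.9, 0.95]
noisy_pseudo_labels = [0, 0, 1, 1, 1, 1]
refined, boundary = self_train(scores, noisy_pseudo_labels)
```

In this toy run the mislabeled low-scoring entry is flipped back in the first round because the model is confident it is absent. With a real model the same loop applies, but the confidence threshold and number of rounds would need tuning.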
Findings
GraphPI achieves strong performance on multiple test datasets.
It significantly reduces computation time compared to existing algorithms.
Because its features are normalized, the method does not require dataset-specific fine-tuning.
Abstract
The integration of deep learning approaches in biomedical research has been transformative, enabling breakthroughs in various applications. Despite these strides, its application in protein inference is impeded by the scarcity of extensively labeled datasets, a challenge compounded by the high costs and complexities of accurate protein annotation. In this study, we introduce GraphPI, a novel framework that treats protein inference as a node classification problem. We treat proteins as interconnected nodes within a protein-peptide-PSM graph, utilizing a Graph Neural Network-based architecture to elucidate their interrelations. To address label scarcity, we train the model on a set of unlabeled public protein datasets with pseudo-labels derived from an existing protein inference algorithm, enhanced by self-training to iteratively refine labels based on confidence scores. Contrary to…
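To make the protein-peptide-PSM graph concrete, here is a minimal sketch of how such a graph might be assembled. The input layout (a list of `(psm_id, peptide, score)` tuples and a peptide-to-proteins mapping), the edge scheme (PSM to peptide, peptide to protein), and the function name `build_graph` are assumptions for illustration, not the paper's actual data format or code.

```python
from collections import defaultdict


def build_graph(psms, peptide_to_proteins):
    """Build an undirected protein-peptide-PSM graph.

    Returns (nodes, adjacency): nodes maps a (kind, name) key to an
    attribute dict, adjacency maps each key to its set of neighbour keys.
    """
    nodes = {}
    adjacency = defaultdict(set)

    def link(a, b):
        adjacency[a].add(b)
        adjacency[b].add(a)

    for psm_id, peptide, score in psms:
        psm_key = ("psm", psm_id)
        pep_key = ("peptide", peptide)
        nodes[psm_key] = {"score": score}  # PSM nodes carry the match score
        nodes.setdefault(pep_key, {})
        link(psm_key, pep_key)
        # Connect the peptide to every protein that could have produced it.
        for protein in peptide_to_proteins.get(peptide, ()):
            prot_key = ("protein", protein)
            nodes.setdefault(prot_key, {})
            link(pep_key, prot_key)
    return nodes, dict(adjacency)


# Hypothetical example: PEPTIDEB is shared by two proteins, which is the
# kind of ambiguity a node classifier over protein nodes has to resolve.
example_psms = [("s1", "PEPTIDEA", 0.9), ("s2", "PEPTIDEA", 0.7), ("s3", "PEPTIDEB", 0.4)]
example_map = {"PEPTIDEA": ["P1"], "PEPTIDEB": ["P1", "P2"]}
nodes, adjacency = build_graph(example_psms, example_map)
```

Node classification would then assign a present/absent label to each `("protein", ...)` node, using information passed along these edges from its peptides and their PSMs.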
