A Sequence-Based Mesh Classifier for the Prediction of Protein-Protein Interactions
Edgar D. Coelho, Igor N. Cruz, André Santiago, José Luis Oliveira, António Dourado, Joel P. Arrais

TL;DR
This paper introduces a sequence-based machine learning model that uses a mesh of classifiers to predict protein-protein interactions; in extensive validation it outperformed existing sequence-based methods.
Contribution
It presents a novel mesh classifier approach that combines physicochemical amino acid clustering and the discrete cosine transform for improved protein-protein interaction (PPI) prediction.
Findings
Achieved an average AUC of 0.84 with the RBF-kernel SVM model.
Outperformed state-of-the-art sequence-based PPI prediction methods.
Validated results across diverse datasets with cross-validation and out-of-sample testing.
Abstract
The worldwide surge of multiresistant microbial strains has propelled the search for alternative treatment options. The study of Protein-Protein Interactions (PPIs) has been a cornerstone in the clarification of complex physiological and pathogenic processes, thus being a priority for the identification of vital components and mechanisms in pathogens. Despite the advances of laboratorial techniques, computational models allow the screening of protein interactions between entire proteomes in a fast and inexpensive manner. Here, we present a supervised machine learning model for the prediction of PPIs based on the protein sequence. We cluster amino acids regarding their physicochemical properties, and use the discrete cosine transform to represent protein sequences. A mesh of classifiers was constructed to create hyper-specialised classifiers dedicated to the most relevant pairs of…
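The encoding step described in the abstract (cluster amino acids by physicochemical properties, then represent each sequence with the discrete cosine transform) might look roughly like the following minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the seven-group partition in GROUPS, the unnormalised DCT-II, the choice to keep the first n_coeffs coefficients, and the helper names encode, dct2, sequence_features, and pair_features.

```python
import math

# Illustrative seven-group partition of the 20 standard amino acids.
# ASSUMPTION: the paper's actual physicochemical clusters may differ.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_GROUP = {aa: i + 1 for i, group in enumerate(GROUPS) for aa in group}


def encode(sequence):
    """Map each residue to its cluster index (1..7).

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    return [AA_TO_GROUP[aa] for aa in sequence.upper()]


def dct2(signal):
    """Unnormalised DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(
            x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal)
        )
        for k in range(n_samples)
    ]


def sequence_features(sequence, n_coeffs=8):
    """Fixed-length vector: the first n_coeffs DCT coefficients.

    ASSUMPTION: truncation with zero-padding is one simple way to make
    sequences of different lengths comparable; the paper may do otherwise.
    """
    coeffs = dct2(encode(sequence))[:n_coeffs]
    return coeffs + [0.0] * (n_coeffs - len(coeffs))


def pair_features(seq_a, seq_b, n_coeffs=8):
    """Concatenate the two proteins' vectors into one candidate-pair vector."""
    return sequence_features(seq_a, n_coeffs) + sequence_features(seq_b, n_coeffs)
```

A vector from pair_features could then be passed to any binary classifier, such as the support vector machine named in the findings. How the mesh assigns pairs to its specialised classifiers is not described in the text above, so it is not sketched here.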
Taxonomy
Topics: Bioinformatics and Genomic Networks · Machine Learning in Bioinformatics · Genomics and Phylogenetic Studies
Methods: Discrete Cosine Transform · Support Vector Machine
