PRED-CLASS: cascading neural networks for generalized protein classification and genome-wide applications
Claude Pasquier, Vasilis Promponas, Stavros Hamodrakas

TL;DR
PRED-CLASS is a hierarchical neural network system that accurately classifies proteins into four categories using amino acid sequences, aiding genome annotation and structural prediction.
Contribution
This work introduces a simple, fast, and effective cascading neural network architecture for generalized protein classification from sequence data.
Findings
Achieved approximately 96% accuracy (371 of 387 test proteins correctly classified)
Trained on as few as 50 protein sequences spread among the four target classes
Demonstrated applicability to complete proteomes
Abstract
A cascading system of hierarchical, artificial neural networks (named PRED-CLASS) is presented for the generalized classification of proteins into four distinct classes (transmembrane, fibrous, globular, and mixed) from information solely encoded in their amino acid sequences. The architecture of the individual component networks is kept very simple, reducing the number of free parameters (network synaptic weights) for faster training, improved generalization, and the avoidance of data overfitting. Capturing information from as few as 50 protein sequences spread among the four target classes (6 transmembrane, 10 fibrous, 13 globular, and 17 mixed), PRED-CLASS was able to obtain 371 correct predictions out of a set of 387 proteins (success rate approximately 96%) unambiguously assigned into one of the target classes. The application of PRED-CLASS to several test sets and complete proteomes…
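The cascading idea in the abstract can be illustrated with a minimal sketch in which each stage makes one binary decision and passes undecided sequences on to the next stage. Everything specific here is an assumption made for illustration: the stage order, the helper names (`residue_fraction`, `classify_cascade`, `STAGES`), and the toy composition thresholds that stand in for the trained networks. The abstract does not describe the actual inputs, stage order, or network details. The last line only reproduces the reported success rate from the figures quoted above.

```python
from typing import Callable, Sequence, Tuple

# A stage pairs a class label with a binary decision function.
# In PRED-CLASS each decision would come from a small trained network;
# the predicates below are hypothetical placeholders, not the paper's models.
Stage = Tuple[str, Callable[[str], bool]]


def residue_fraction(sequence: str, residues: str) -> float:
    """Fraction of positions in `sequence` occupied by any of `residues`."""
    if not sequence:
        return 0.0
    return sum(aa in residues for aa in sequence) / len(sequence)


def classify_cascade(sequence: str, stages: Sequence[Stage], fallback: str) -> str:
    """Return the label of the first stage that accepts the sequence."""
    for label, accepts in stages:
        if accepts(sequence):
            return label
    return fallback


# Assumed stage order and toy thresholds, for illustration only.
STAGES = [
    ("transmembrane", lambda s: residue_fraction(s, "AILMFVW") > 0.6),
    ("fibrous", lambda s: residue_fraction(s, "GP") > 0.4),
    ("globular", lambda s: residue_fraction(s, "DEKR") > 0.2),
]

print(classify_cascade("GPPGPPGPPGPP", STAGES, fallback="mixed"))  # fibrous
print(f"{371 / 387:.1%}")  # 95.9%, i.e. approximately 96%
```

Structuring the problem as a chain of simple binary decisions keeps each component small, which matches the abstract's stated goal of few free parameters per network.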
