An hierarchical artificial neural network system for the classification of transmembrane proteins
Claude Pasquier, Stavros Hamodrakas

TL;DR
This paper introduces a hierarchical neural network that classifies proteins as membrane or non-membrane from sequence data alone. It identified every membrane protein in its reference collection, correctly classified 97.7% of 995 globular proteins, and has been applied to complete genomes.
Contribution
It presents a simple, fast neural network model with high accuracy for protein classification, integrated into a new software package for genome and database analysis.
Findings
100% accuracy on membrane protein identification
97.7% correct classification of globular proteins
Effective application to complete genomes and SWISS-PROT database
Abstract
This work presents a simple artificial neural network which classifies proteins into two classes from their sequences alone: the membrane protein class and the non-membrane protein class. This may be important in the functional assignment and analysis of open reading frames (ORFs) identified in complete genomes and, especially, those ORFs that correspond to proteins with unknown function. The network described here has a simple hierarchical feed-forward topology and a limited number of neurons, which makes it very fast. By using only information contained in 11 protein sequences, the method was able to identify, with 100% accuracy, all membrane proteins with reliable topologies collected from several papers in the literature. Applied to a test set of 995 globular, water-soluble proteins, the neural network falsely assigned 23 of them to the membrane protein class (97.7% of correct…
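The 97.7% figure follows directly from the counts given in the abstract (23 of 995 globular proteins assigned to the wrong class). A minimal sketch of that arithmetic is shown here; the function name `correct_rate` and the variable names are illustrative assumptions, not part of the authors' software.

```python
def correct_rate(total: int, misclassified: int) -> float:
    """Return the percentage of items classified correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= misclassified <= total:
        raise ValueError("misclassified must be between 0 and total")
    return 100.0 * (total - misclassified) / total


# Counts taken from the abstract; the names are hypothetical.
globular_total = 995   # globular, water-soluble proteins in the test set
false_membrane = 23    # of those, wrongly placed in the membrane class

rate = correct_rate(globular_total, false_membrane)
print(f"{rate:.1f}% correct")  # prints "97.7% correct"
```

Because every protein in this test set is globular, the value is the fraction of non-membrane proteins correctly kept out of the membrane class; it says nothing on its own about how membrane proteins are handled.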
Taxonomy
Topics: Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms · Protein Structure and Dynamics
