Protein Secondary Structure Prediction with Long Short Term Memory Networks
Søren Kaae Sønderby, Ole Winther

TL;DR
This paper shows that a bidirectional LSTM network predicts protein secondary structure from amino acid sequences more accurately than the previous state of the art, reaching 0.674 versus 0.664 on the 8-class problem of the CB513 dataset.
Contribution
It introduces a bidirectional LSTM approach for protein secondary structure prediction, surpassing previous state-of-the-art methods in accuracy.
Findings
Achieved 0.674 accuracy on 8-class secondary structure prediction (CB513), against 0.664 for the previous state of the art.
LSTM-based model outperforms traditional feed forward and SVM methods.
The model places feed forward networks between the LSTM cells, a design the authors suggest exploring further.
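As a rough illustration of what a score such as 0.674 measures, the sketch below computes per-residue accuracy over a set of sequences. It assumes the reported figure is the usual per-residue 8-class accuracy (often called Q8) and that labels use the eight DSSP state letters; the function name `q8_accuracy` and the toy strings are hypothetical and not taken from the paper.

```python
# Assumption: the eight DSSP secondary structure states, one letter per residue.
DSSP8 = "GHIEBTSL"


def q8_accuracy(predicted, observed):
    """Fraction of residues whose predicted state equals the observed state,
    pooled over all sequences (hypothetical helper, not the authors' code)."""
    correct = 0
    total = 0
    for pred, obs in zip(predicted, observed):
        if len(pred) != len(obs):
            raise ValueError("prediction and label lengths differ")
        correct += sum(p == o for p, o in zip(pred, obs))
        total += len(obs)
    return correct / total if total else 0.0


# Toy data, invented for illustration only.
preds = ["HHHELL", "EEBT"]
labels = ["HHHHLL", "EEBS"]
acc = q8_accuracy(preds, labels)
```

Pooling over residues rather than averaging per sequence means long proteins weigh more; whether the paper pools this way is an assumption here.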
Abstract
Prediction of protein secondary structure from the amino acid sequence is a classical bioinformatics problem. Common methods use feed forward neural networks or SVMs combined with a sliding window, as these models do not naturally handle sequential data. Recurrent neural networks are a generalization of the feed forward neural network that naturally handles sequential data. We use a bidirectional recurrent neural network with long short term memory cells for prediction of secondary structure and evaluate using the CB513 dataset. On the secondary structure 8-class problem we report better performance (0.674) than state of the art (0.664). Our model includes feed forward networks between the long short term memory cells, a path that can be further explored.
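The contrast the abstract draws between sliding-window models and bidirectional recurrent models can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it uses a plain tanh recurrence as a stand-in for an LSTM cell, random untrained weights, a one-hot encoding over 20 amino acids, and hypothetical names (`sliding_windows`, `rnn_pass`, `bidirectional_states`) and sizes chosen only for the example.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: 20 standard residues


def one_hot(residue):
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def sliding_windows(sequence, half_width):
    """Sliding-window view: one fixed-size, zero-padded input per residue."""
    pad = [[0.0] * len(AMINO_ACIDS)] * half_width
    encoded = pad + [one_hot(r) for r in sequence] + pad
    windows = []
    for i in range(len(sequence)):
        window = encoded[i:i + 2 * half_width + 1]
        windows.append([x for vec in window for x in vec])  # flatten
    return windows


def rnn_pass(inputs, w_in, w_rec):
    """Plain tanh recurrence over a sequence (stand-in for an LSTM cell)."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        states.append(hidden)
    return states


def bidirectional_states(sequence, hidden_size=4, seed=0):
    """Recurrent view: each residue sees the whole sequence in both directions."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    inputs = [one_hot(r) for r in sequence]
    n_in = len(AMINO_ACIDS)
    fwd = rnn_pass(inputs, rand_matrix(hidden_size, n_in),
                   rand_matrix(hidden_size, hidden_size))
    # Run a second recurrence right-to-left, then restore left-to-right order.
    bwd = rnn_pass(inputs[::-1], rand_matrix(hidden_size, n_in),
                   rand_matrix(hidden_size, hidden_size))[::-1]
    return [f + b for f, b in zip(fwd, bwd)]  # concatenate per residue


windows = sliding_windows("ACDE", half_width=2)
states = bidirectional_states("ACDE")
```

With a window, the context each residue sees is capped at `2 * half_width + 1` positions; with the bidirectional pass, the per-residue state depends on every position to its left and right. A classifier over the 8 classes would sit on top of either representation.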
Taxonomy
Topics: Machine Learning in Bioinformatics · Protein Structure and Dynamics · Genomics and Phylogenetic Studies
