Classification of Influenza Hemagglutinin Protein Sequences using Convolutional Neural Networks
Charalambos Chrysostomou, Floris Alexandrou, Mihalis A. Nicolaou, and Huseyin Seker

TL;DR
This study uses convolutional neural networks to classify influenza A Hemagglutinin (HA) protein sequences by host (Human, Avian or Swine), achieving higher accuracy and more balanced results across hosts than previous methods.
Contribution
The paper introduces a CNN-based approach using hydrophobicity index encoding for accurate influenza HA protein host classification, improving accuracy and balance over prior work.
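The encoding step can be illustrated with a minimal sketch. The summary does not say which hydrophobicity index the authors use, so the Kyte-Doolittle hydropathy scale is assumed here purely for illustration. The function name `encode_sequence`, the fixed output length, and the zero padding are also hypothetical choices, not details from the paper.

```python
# Kyte-Doolittle hydropathy values. This scale is an ASSUMPTION: the summary
# only says "hydrophobicity index" and does not name a specific one.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_sequence(sequence, length, pad_value=0.0):
    """Map each residue to a hydropathy value, then pad or truncate to `length`.

    Unknown symbols (for example "X") are mapped to `pad_value`; this is a
    hypothetical convention chosen for the sketch.
    """
    signal = [KYTE_DOOLITTLE.get(residue, pad_value) for residue in sequence.upper()]
    signal = signal[:length]
    return signal + [pad_value] * (length - len(signal))


if __name__ == "__main__":
    # Toy fragment for illustration only, not a real HA sequence.
    print(encode_sequence("MKAIL", length=8))
```

A fixed length is used because convolutional models are usually trained on equal-sized inputs; how the authors handle sequences of different lengths is not stated in this summary.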
Findings
Up to 10% higher accuracy for avian host classification.
More balanced accuracy across host classes.
Effective distinction of HA sequences for different hosts.
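One common way to check whether accuracy is balanced across host classes is to compute accuracy separately for each class and then average the per-class values. The sketch below assumes this reading of "balanced"; the paper's exact metric is not given in this summary, and the labels are made-up toy data, not results from the study.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Return, for each true class, the fraction of its samples predicted correctly."""
    correct = Counter()
    total = Counter()
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class accuracies."""
    scores = per_class_accuracy(y_true, y_pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = ["Human"] * 4 + ["Avian"] * 2 + ["Swine"] * 2
    y_pred = ["Human", "Human", "Human", "Human",
              "Avian", "Human", "Swine", "Human"]
    print(per_class_accuracy(y_true, y_pred))
    print(balanced_accuracy(y_true, y_pred))
```

In this toy case the overall accuracy is 6/8, while the Avian and Swine classes are each only half correct. A single overall figure can hide this kind of imbalance.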
Abstract
The Influenza virus can be considered one of the most severe viruses, as it can infect multiple species, often with fatal consequences for the hosts. The Hemagglutinin (HA) gene of the virus can be a target for antiviral drug development, realised through accurate identification of its sub-types and possibly the targeted hosts. This paper focuses on accurately predicting whether an Influenza type A virus can infect specific hosts, namely Human, Avian and Swine hosts, using only the protein sequence of the HA gene. In more detail, we propose encoding the protein sequences into numerical signals using the Hydrophobicity Index and subsequently utilising a Convolutional Neural Network-based predictive model. The Influenza HA protein sequences used in the proposed work are obtained from the Influenza Research Database (IRD). Specifically, complete and unique HA protein sequences…
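To give a sense of what a convolutional layer does with a numerical signal of this kind, here is a minimal forward-pass sketch in plain Python. It is a generic illustration, not the authors' architecture. The kernel values, the ReLU activation, and the global max pooling are all assumptions, and the input is a toy signal rather than an encoded HA sequence.

```python
def conv1d_relu(signal, kernel, bias=0.0):
    """Slide `kernel` over `signal` (no padding, stride 1) and apply ReLU."""
    width = len(kernel)
    feature_map = []
    for start in range(len(signal) - width + 1):
        window = signal[start:start + width]
        activation = sum(x * w for x, w in zip(window, kernel)) + bias
        feature_map.append(max(0.0, activation))
    return feature_map


def global_max_pool(feature_map):
    """Reduce a feature map to its single strongest response."""
    return max(feature_map)


if __name__ == "__main__":
    toy_signal = [1.0, 3.0, 2.0, 5.0]   # hypothetical values
    edge_kernel = [-1.0, 1.0]           # responds to increases between neighbours
    features = conv1d_relu(toy_signal, edge_kernel)
    print(features)
    print(global_max_pool(features))
```

In a trained network the kernel weights are learned from data, and many kernels run in parallel before a final layer maps the pooled features to host classes.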
Taxonomy
Topics: Influenza Virus Research Studies · Machine Learning in Bioinformatics · Respiratory viral infections research
