A Fusion-Driven Approach of Attention-Based CNN-BiLSTM for Protein Family Classification -- ProFamNet
Bahar Ali, Anwar Shah, Malik Niaz, Musadaq Mansoord, Sami Ullah, Muhammad Adnan

TL;DR
ProFamNet is a novel fusion model combining CNN, BiLSTM, and attention for protein family classification, achieving higher accuracy and efficiency with fewer parameters and training epochs.
Contribution
The paper introduces ProFamNet, a new fusion-based neural network architecture that outperforms existing models in protein family classification tasks.
Findings
ProFamNet achieved a 98.30% F1 score, surpassing previous models.
The model is more parameter-efficient with only 450,953 parameters.
Training converges faster, requiring fewer epochs than previous models.
Abstract
Advanced automated AI techniques allow us to classify protein sequences and discern their biological families and functions. Conventional approaches for classifying these protein families often focus on extracting N-Gram features from the sequences while overlooking crucial motif information and the interplay between motifs and neighboring amino acids. Recently, convolutional neural networks have been applied to amino acid and motif data, even with a limited dataset of well-characterized proteins, resulting in improved performance. This study presents a model for classifying protein families using the fusion of 1D-CNN, BiLSTM, and an attention mechanism, which combines spatial feature extraction, long-term dependencies, and context-aware representations. The proposed model (ProFamNet) achieved superior model efficiency with 450,953 parameters and a compact size of 1.72 MB, outperforming…
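To make the attention step in the abstract concrete, here is a minimal sketch of additive attention pooling over per-position feature vectors, such as those a BiLSTM might emit. The abstract does not specify the exact attention formulation, so the tanh-based scoring, the function names `softmax` and `attention_pool`, and the toy weights `W` and `v` are all assumptions for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, W, v):
    """Additive attention pooling (assumed form, not taken from the paper).

    hidden_states: list of T vectors of length d (e.g. BiLSTM outputs).
    W: d_a x d projection matrix (hypothetical learned weights).
    v: scoring vector of length d_a (hypothetical learned weights).
    Returns (weights, context): one weight per position, and the
    weighted sum of the hidden states.
    """
    scores = []
    for h in hidden_states:
        # Project the hidden state and squash with tanh.
        projected = [
            math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)))
            for row in W
        ]
        # Reduce the projection to one scalar score per position.
        scores.append(sum(v_i * p_i for v_i, p_i in zip(v, projected)))

    weights = softmax(scores)
    d = len(hidden_states[0])
    # Context vector: attention-weighted sum over sequence positions.
    context = [
        sum(a * h[j] for a, h in zip(weights, hidden_states))
        for j in range(d)
    ]
    return weights, context


# Toy input: three sequence positions with 2-dimensional features.
states = [[0.1, 0.2], [0.9, 0.8], [0.0, -0.1]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity projection, for illustration only
v = [1.0, 1.0]
weights, context = attention_pool(states, W, v)
```

In a fusion model of the kind described, a context vector like this would typically feed a softmax classifier over protein families; how ProFamNet wires the CNN, BiLSTM, and attention outputs together is not stated in the text shown here.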
Taxonomy
Topics: Machine Learning in Bioinformatics · Gene expression and cancer classification · Genetics, Bioinformatics, and Biomedical Research
Methods: Softmax · Attention Is All You Need · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional LSTM · Focus
