Using Deep Learning Sequence Models to Identify SARS-CoV-2 Divergence
Yanyi Ding, Zhiyi Kuang, Yuxin Pei, Jeff Tan, Ziyu Zhang, Joseph Konan

TL;DR
This paper presents a neural network model using deep learning sequence techniques to classify SARS-CoV-2 strains from amino acid sequences, aiding rapid divergence detection and potentially improving genomic analysis efficiency.
Contribution
It introduces a novel neural network combining recurrent and convolutional units for classifying SARS-CoV-2 clades directly from spike protein sequences, offering a more efficient alternative to existing methods.
Findings
Model achieves high classification accuracy.
Offers a potentially more computationally efficient alternative to BERT-based models.
Provides a scalable approach for genomic divergence detection.
Abstract
SARS-CoV-2 is an upper respiratory system RNA virus that had caused over 3 million deaths and infected over 150 million people worldwide as of May 2021. With thousands of strains sequenced to date, SARS-CoV-2 mutations pose significant challenges for scientists trying to keep pace with vaccine development and public health measures. Therefore, an efficient method of identifying the divergence of lab samples from patients would greatly aid the documentation of SARS-CoV-2 genomics. In this study, we propose a neural network model that leverages recurrent and convolutional units to take amino acid sequences of spike proteins directly as input and classify the corresponding clades. We also compare our model's performance with Bidirectional Encoder Representations from Transformers (BERT) pre-trained on a protein database. Our approach has the potential of providing a more computationally efficient alternative to…
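The abstract describes the model only at a high level, so the following is a minimal sketch of the general idea rather than the authors' architecture: amino acids are one-hot encoded, passed through a 1D convolution, summarized by a simple recurrent unit, and mapped to clade probabilities with a softmax. Everything specific here is an assumption made for illustration, including the function names, the layer sizes, the plain Elman recurrence, the number of clades (3), and the random untrained weights. It uses only the Python standard library.

```python
import math
import random

# The 20 standard amino acids; index 0 is reserved for unknown residues.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_TO_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}
VOCAB_SIZE = len(AMINO_ACIDS) + 1


def one_hot_encode(sequence):
    """Map an amino acid string to a list of one-hot vectors, one per residue."""
    vectors = []
    for residue in sequence.upper():
        vec = [0.0] * VOCAB_SIZE
        vec[AA_TO_INDEX.get(residue, 0)] = 1.0
        vectors.append(vec)
    return vectors


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these from labeled data."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def conv1d_relu(inputs, kernels):
    """Valid 1D convolution along the sequence, followed by ReLU.

    kernels[f] is a (width x input_dim) matrix for filter f.
    """
    width = len(kernels[0])
    outputs = []
    for start in range(len(inputs) - width + 1):
        features = []
        for kernel in kernels:
            total = sum(
                kernel[k][d] * inputs[start + k][d]
                for k in range(width)
                for d in range(len(inputs[0]))
            )
            features.append(max(0.0, total))
        outputs.append(features)
    return outputs


def simple_rnn(inputs, w_in, w_rec):
    """Elman recurrence h_t = tanh(W_in x_t + W_rec h_{t-1}); returns the final h."""
    hidden = [0.0] * len(w_rec)
    for x in inputs:
        hidden = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][i] * hidden[i] for i in range(len(hidden)))
            )
            for j in range(len(hidden))
        ]
    return hidden


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(num_filters=4, width=3, hidden=5, num_clades=3, seed=0):
    """Hypothetical layer sizes, chosen only to keep the example small."""
    rng = random.Random(seed)
    return {
        "kernels": [random_matrix(width, VOCAB_SIZE, rng) for _ in range(num_filters)],
        "w_in": random_matrix(hidden, num_filters, rng),
        "w_rec": random_matrix(hidden, hidden, rng),
        "w_out": random_matrix(num_clades, hidden, rng),
    }


def classify(sequence, params):
    """Forward pass: one-hot -> conv1d + ReLU -> recurrent summary -> softmax."""
    x = one_hot_encode(sequence)
    x = conv1d_relu(x, params["kernels"])
    h = simple_rnn(x, params["w_in"], params["w_rec"])
    logits = [sum(w * v for w, v in zip(row, h)) for row in params["w_out"]]
    return softmax(logits)


# Usage: a short example amino acid string, not a full spike protein.
probs = classify("MFVFLVLLPLVSSQ", init_params())
print(probs)  # one probability per hypothetical clade
```

Because the weights are random, the printed probabilities carry no biological meaning. The sketch only shows how a convolutional stage can extract local residue patterns before a recurrent stage aggregates them along the sequence.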
Taxonomy
Topics: Machine Learning in Bioinformatics · Genomics and Phylogenetic Studies · Vaccines and Immunoinformatics Approaches
