CoViT: Real-time phylogenetics for the SARS-CoV-2 pandemic using Vision Transformers
Zuher Jahshan, Can Alkan, Leonid Yavits

TL;DR
CoViT leverages Vision Transformers to rapidly and accurately classify and place SARS-CoV-2 genomes within the phylogenetic tree, significantly speeding up pandemic tracking.
Contribution
This work introduces CoViT, a novel neural network approach using Vision Transformers for real-time viral genome classification and phylogenetic placement.
Findings
Achieves 94.2% accuracy in genome placement
Provides top-two placement correctness of 97.9%
Classifies genomes in 0.055 seconds on GPU
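The first two findings are top-1 and top-2 style metrics. As a minimal sketch of how such figures are commonly computed, the snippet below assumes "top-two placement correctness" means the true lineage is among the two highest-scoring candidates; the summary does not confirm that reading. The function name top_k_accuracy and the toy scores are hypothetical illustrations, not CoViT code or data.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 1,
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes.

    Hypothetical helper for illustration only; not taken from the CoViT code base.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy per-class scores for four genomes and three candidate lineages (made-up values).
toy_scores = [
    [0.7, 0.2, 0.1],  # true class 0: best guess is correct
    [0.4, 0.5, 0.1],  # true class 0: only the second-best guess is correct
    [0.1, 0.3, 0.6],  # true class 2: best guess is correct
    [0.5, 0.1, 0.4],  # true class 1: neither of the top two guesses is correct
]
toy_labels = [0, 0, 2, 1]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
print(f"top-1 accuracy: {top1:.2f}, top-2 accuracy: {top2:.2f}")
```

Under this reading, the gap between the 94.2% and 97.9% figures would correspond to genomes whose correct placement is the model's second choice rather than its first.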
Abstract
Real-time viral genome detection, taxonomic classification and phylogenetic analysis are critical for efficient tracking and control of viral pandemics such as COVID-19. However, the unprecedented and still growing amounts of viral genome data create a computational bottleneck, which effectively prevents real-time pandemic tracking. For genomic tracing to work effectively, each new viral genome sequence must be placed in its pangenomic context. Re-inferring the full phylogeny of SARS-CoV-2, with datasets containing millions of samples, is prohibitively slow even using powerful computational resources. We are attempting to alleviate the computational bottleneck by modifying and applying the Vision Transformer, a recently developed neural network model for image recognition, to taxonomic classification and placement of viral genomes, such as SARS-CoV-2. Our solution, CoViT, places…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Machine Learning in Bioinformatics · Cell Image Analysis Techniques
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Label Smoothing · Softmax · Adam · Position-Wise Feed-Forward Layer · Layer Normalization · Byte Pair Encoding · Residual Connection
