Variant-driven multi-wave pattern of COVID-19 via a Machine Learning analysis of spike protein mutations
Adele de Hoffer, Shahram Vatani, Corentin Cot, Giacomo Cacciapaglia, Maria Luisa Chiusano, Andrea Cimarelli, Francesco Conventi, Antonio Giannini, Stefan Hohenegger, Francesco Sannino

TL;DR
This study uses machine learning to analyze spike protein mutations in SARS-CoV-2, enabling unbiased detection of emerging variants, predicting pandemic waves, and serving as an early warning system for future outbreaks.
Contribution
It introduces a novel ML approach that identifies and tracks virus variants without prior knowledge, validated against genome-based methods, and predicts pandemic waves driven by new variants.
Findings
Each COVID-19 wave is driven by a new emerging variant.
The ML method detects variants when they are only 1% of sequences.
The approach acts as an early warning system for new variants.
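The 1% detection claim can be illustrated with a small sketch. It assumes each sequence has already been assigned a cluster label and that sequences are binned by week. The function name `first_detection_week`, the weekly binning, and the toy counts are hypothetical and not taken from the paper.

```python
from collections import Counter

def first_detection_week(weekly_labels, threshold=0.01):
    """Map each cluster label to the first week its share reaches `threshold`.

    weekly_labels: list of lists, one list of cluster labels per week
    (hypothetical input format). threshold=0.01 mirrors the 1% figure above.
    """
    first_seen = {}
    for week, labels in enumerate(weekly_labels):
        if not labels:
            continue  # skip weeks with no sequenced samples
        total = len(labels)
        for label, count in Counter(labels).items():
            if label not in first_seen and count / total >= threshold:
                first_seen[label] = week
    return first_seen

# Toy data: cluster "B" grows from 0.5% to 2% to 50% of weekly sequences.
weeks = [
    ["A"] * 200,
    ["A"] * 199 + ["B"] * 1,
    ["A"] * 196 + ["B"] * 4,
    ["A"] * 100 + ["B"] * 100,
]
detections = first_detection_week(weeks)
```

In this toy run, "B" is flagged in week 2, when it makes up 2% of sequences and is still far from dominant. That early flag is the sense in which such a threshold can act as an early warning.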
Abstract
Applying a ML approach to the temporal variability of the Spike protein sequence enables us to identify, classify and track emerging virus variants. Our analysis is unbiased, in the sense that it does not require any prior knowledge of the variant characteristics, and our results are validated by other informed methods that define variants based on the complete genome. Furthermore, correlating persistent variants of our approach to epidemiological data, we discover that each new wave of the COVID-19 pandemic is driven and dominated by a new emerging variant. Our results are therefore indispensable for further studies on the evolution of SARS-CoV-2 and the prediction of evolutionary patterns that determine current and future mutations of the Spike proteins, as well as their diversification and persistence during the viral spread. Moreover, our ML algorithm works as an efficient early…
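As a rough illustration of grouping Spike sequences without prior variant labels, the sketch below clusters sequences by edit distance. The abstract as quoted here does not specify the algorithm, so the Levenshtein distance, the single-linkage grouping, the `max_dist` cutoff, and the short toy sequences are all assumptions made for this example, not the authors' method.

```python
def levenshtein(a, b):
    """Edit distance between two strings (dynamic programming, two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def cluster_sequences(seqs, max_dist):
    """Single-linkage grouping: sequences within max_dist edits are merged."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if levenshtein(seqs[i], seqs[j]) <= max_dist:
                parent[find(j)] = find(i)

    ids = {}
    # Relabel union-find roots as consecutive cluster ids 0, 1, 2, ...
    return [ids.setdefault(find(i), len(ids)) for i in range(len(seqs))]

# Toy protein fragments (hypothetical, far shorter than a real Spike sequence).
seqs = [
    "MFVFLVLLPLVSSQ",
    "MFVFLVLLPLVSSQ",
    "MFVFLVLLPLVSSR",  # one substitution away from the first
    "MFVFFVLLPHVSAQ",  # three substitutions away from the first
    "MFVFFVLLPHVSAQ",
]
labels = cluster_sequences(seqs, max_dist=1)
```

No variant names enter the computation: the groups emerge only from sequence similarity, which is the sense in which such an analysis is unbiased. The pairwise loop is quadratic in the number of sequences, so a real dataset would need deduplication or a faster distance scheme.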
Taxonomy
Topics: SARS-CoV-2 and COVID-19 Research · Genomics and Rare Diseases · Vaccines and immunoinformatics approaches
