Stochastic Mutation as a Mechanism for the Emergence of SARS-CoV-2 New Variants
Liaofu Luo, Jun Lv

TL;DR
This study models SARS-CoV-2 evolution using stochastic methods and cladogram algorithms to predict new variants and macro-lineage emergence based on spike protein mutations.
Contribution
It introduces a stochastic mutation generation approach combined with cladogram analysis to forecast SARS-CoV-2 macro-lineage emergence.
Findings
Emergence of new strains is primarily influenced by the number of randomly generated mutation sites.
Large-scale stochastic sampling reveals shifts in macro-lineage dominance.
Threshold mutation site counts predict transitions between macro-lineages.
Abstract
Predicting the future evolutionary trajectory of SARS-CoV-2 remains a critical challenge, particularly due to the pivotal role of spike protein mutations. It is therefore essential to develop evolutionary models capable of continuously integrating new experimental data. In this study, we employ a cladogram algorithm that incorporates established assumptions for mutant representation -- using both four-letter and two-letter formats -- along with an n-mer distance algorithm to construct a cladogenetic tree of SARS-CoV-2 mutations. This tree accurately captures the observed changes across macro-lineages. We introduce a stochastic method for generating new strains on this tree based on spike protein mutations. For a given set A of existing mutation sites, we define a set X comprising x randomly generated mutation sites on the spike protein. The intersection of A and X, denoted as set Y,…
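The set construction described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the x sites are drawn uniformly without replacement over the 1273 residue positions of the spike protein, and the example set `A`, the function names, and the seed are hypothetical.

```python
import random

# Length of the SARS-CoV-2 spike protein reference sequence, in residues.
SPIKE_LENGTH = 1273


def sample_sites(x, rng):
    """Draw the set X: x distinct spike positions (1-based).

    Uniform sampling without replacement is an assumption of this sketch;
    the source only says the sites are randomly generated.
    """
    return set(rng.sample(range(1, SPIKE_LENGTH + 1), x))


def overlap_with_existing(existing_sites, x, rng):
    """Return (X, Y), where Y is the intersection of A (existing_sites) and X."""
    new_sites = sample_sites(x, rng)
    return new_sites, existing_sites & new_sites


# Hypothetical set A of existing mutation sites, for illustration only.
A = {69, 70, 144, 417, 452, 484, 501, 614, 681}

rng = random.Random(0)  # fixed seed so the run is repeatable
X, Y = overlap_with_existing(A, 50, rng)
print(f"|X| = {len(X)}, |Y| = {len(Y)}, Y = {sorted(Y)}")
```

Under the uniform-sampling assumption, the size of Y follows a hypergeometric distribution with mean x·|A|/1273, so larger x makes overlap with the existing sites more likely.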
Taxonomy
Topics: SARS-CoV-2 and COVID-19 Research
Methods: Linear Regression · Sparse Evolutionary Training
