Integrating genomic epidemiology and deep mutational scanning data for prevalence forecasting of SARS-CoV-2 Omicron lineages
Zhong-yi Lei, Xiao-min Zhang, Jia-lu Han, Ji-guo Xue, Jia-yi Xu, Zi-lin Ren, Yi-gang Tong, Xiao-chen Bo, Ming Ni

TL;DR
This paper introduces CoVPF, a model that combines genomic data and mutation effects to better predict the spread of SARS-CoV-2 Omicron variants.
Contribution
The novel integration of genomic epidemiology and deep mutational scanning data, with emphasis on epistasis, improves lineage prevalence forecasting.
Findings
CoVPF achieved 20.7% higher accuracy in predicting lineage prevalence compared to previous models.
Ignoring epistasis reduced forecasting accuracy by 43%, highlighting its importance.
CoVPF provided more accurate and timely forecasts for lineage expansions like EG.5.1 and XBB.1.5.
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continues to circulate, and the Omicron variants have diversified into over 2,500 lineages. Predicting which lineages will become prevalent next, and the inflection points of dominant lineages, is of public health significance and research interest. A previous study integrated genomic data to forecast lineage prevalence but overlooked the functional aspects of mutations, while efforts to evaluate the functional effects of individual mutations have not extended to the lineage level. Here, we propose CoVPF, a model integrating both genomic epidemiology and deep mutational scanning (DMS) data for the receptor-binding domain (RBD) of the SARS-CoV-2 spike protein, to predict the prevalence of Omicron lineages. Retrospective validation demonstrated that CoVPF achieved 20.7% higher accuracy than the previous study. Furthermore, we found that accounting for epistasis was…
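As a point of reference for the quantity being forecast, lineage prevalence is commonly taken to be the share of sequenced genomes assigned to a lineage within a time window. The sketch below illustrates only that bookkeeping. It is not CoVPF, and the function name `weekly_prevalence`, the `(week, lineage)` record layout, and the toy records are assumptions made for illustration rather than anything taken from the paper or from real surveillance data.

```python
from collections import Counter, defaultdict


def weekly_prevalence(records):
    """Return {week: {lineage: fraction of that week's sequences}}.

    `records` is an iterable of (week, lineage) pairs, one per sequenced
    genome. This layout is a hypothetical choice for the example.
    """
    counts = defaultdict(Counter)
    for week, lineage in records:
        counts[week][lineage] += 1

    prevalence = {}
    for week, lineage_counts in counts.items():
        total = sum(lineage_counts.values())
        prevalence[week] = {
            lineage: n / total for lineage, n in lineage_counts.items()
        }
    return prevalence


# Toy records, invented for illustration (not real surveillance counts).
records = [
    ("2023-W01", "BQ.1"), ("2023-W01", "BQ.1"),
    ("2023-W01", "BQ.1"), ("2023-W01", "XBB.1.5"),
    ("2023-W02", "BQ.1"), ("2023-W02", "XBB.1.5"),
    ("2023-W02", "XBB.1.5"), ("2023-W02", "XBB.1.5"),
]

prev = weekly_prevalence(records)
print(prev["2023-W01"]["XBB.1.5"])  # 0.25
print(prev["2023-W02"]["XBB.1.5"])  # 0.75
```

A forecasting model would take a series like this as its target; how CoVPF combines it with DMS-derived mutation effects is described in the full text, not here.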
Taxonomy
Topics: SARS-CoV-2 and COVID-19 Research · vaccines and immunoinformatics approaches · COVID-19 Clinical Research Studies
