Statistical challenges in the analysis of sequence and structure data for the COVID-19 spike protein

Shiyu He; Samuel W.K. Wong

arXiv:2101.02304·stat.AP·February 2, 2021

Open Access

TL;DR

This paper develops statistical models to analyze the evolution and structural changes of the SARS-CoV-2 spike protein, revealing how certain mutation combinations may spread more rapidly.

Contribution

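It builds a Bayesian hierarchical model for the temporal and spatial evolution of S-protein sequences, after grouping them into representative clusters, and applies sampling methods to investigate how commonly observed mutations may change the protein's 3-D structure.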

Findings

01

D614G variants are spreading widely.

02

Variants carrying D614G together with S477N or A222V may spread even more rapidly than D614G variants in general, as quantified by the model estimates.

03

Sampling-based structural analysis is used to investigate possible changes to the S-protein's 3-D conformation caused by commonly observed mutations.

Abstract

As the major target of many vaccines and neutralizing antibodies against SARS-CoV-2, the spike (S) protein is observed to mutate over time. In this paper, we present statistical approaches to tackle some challenges associated with the analysis of S-protein data. We build a Bayesian hierarchical model to study the temporal and spatial evolution of S-protein sequences, after grouping the sequences into representative clusters. We then apply sampling methods to investigate possible changes to the S-protein's 3-D structure as a result of commonly observed mutations. While the increasing spread of D614G variants has been noted in other research, our results also show that the co-occurring mutations of D614G together with S477N or A222V may spread even more rapidly, as quantified by our model estimates.
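The mutation labels used above follow the standard substitution notation: D614G means the reference residue D (aspartic acid) at position 614 is replaced by G (glycine). As a rough illustration of what grouping sequences by their mutations could look like, here is a minimal Python sketch. It is not the clustering procedure used in the paper, which this page does not describe; it assumes the sequences are already aligned to the reference with equal length, and the function names and toy data are hypothetical.

```python
from collections import Counter


def call_mutations(reference, sequence):
    """List substitutions (e.g. 'D614G') between two pre-aligned sequences.

    Assumption: both strings have the same length, so no insertions or
    deletions are handled here.
    """
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return tuple(
        f"{ref_aa}{pos}{alt_aa}"
        for pos, (ref_aa, alt_aa) in enumerate(zip(reference, sequence), start=1)
        if ref_aa != alt_aa
    )


def cluster_by_mutations(reference, sequences):
    """Count sequences sharing the same set of substitutions.

    This groups by exact mutation profile, a simple stand-in for the
    'representative clusters' mentioned in the abstract.
    """
    return Counter(call_mutations(reference, s) for s in sequences)


# Toy four-residue example (made-up data, not real spike sequences).
toy_reference = "MDAG"
toy_sequences = ["MDAG", "MGAG", "MGAV", "MGAG"]
clusters = cluster_by_mutations(toy_reference, toy_sequences)

for profile, count in clusters.items():
    label = "+".join(profile) if profile else "reference"
    print(label, count)
```

Counting such profiles per region and per time window would give the kind of cluster-level data a temporal and spatial model could be fitted to, though the actual model inputs in the paper may differ.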

Taxonomy

Topics: SARS-CoV-2 and COVID-19 Research · Vaccines and immunoinformatics approaches · Influenza Virus Research Studies