TL;DR
This paper introduces a statistical mixture model to accurately estimate SARS-CoV-2 variant proportions in pooled samples, aiding real-time genomic surveillance of evolving virus strains.
Contribution
The paper presents a novel mixture model that effectively estimates SARS-CoV-2 variant frequencies from pooled sequencing data, supporting both raw reads and predefined markers.
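To make the idea of a mixture model over pooled data concrete, the following is a minimal sketch and not the paper's actual model, whose details are not given in this summary. It assumes each variant is described by a hypothetical 0/1 signature over a set of marker positions, that pooled data are reduced to alternate- and reference-allele read counts per marker, and that reads carry a fixed, assumed sequencing error rate. Variant proportions are then estimated by a standard expectation-maximization (EM) loop. The function name, parameters, and example data are all illustrative.

```python
def estimate_proportions(signatures, alt_counts, ref_counts,
                         error_rate=0.01, n_iter=500, tol=1e-10):
    """Estimate variant proportions in a pooled sample with EM.

    signatures[k][m] is 1 if (hypothetical) variant k carries the alternate
    allele at marker m, else 0. alt_counts[m] and ref_counts[m] are the
    numbers of pooled reads showing the alternate and reference allele.
    error_rate is an assumed per-read error probability (must be > 0).
    """
    n_variants = len(signatures)
    n_markers = len(alt_counts)
    # q[k][m]: probability that a read from variant k shows the alternate allele
    q = [[(1.0 - error_rate) if s else error_rate for s in row]
         for row in signatures]
    pi = [1.0 / n_variants] * n_variants  # start from uniform proportions
    total_reads = sum(alt_counts) + sum(ref_counts)

    for _ in range(n_iter):
        expected = [0.0] * n_variants  # expected number of reads per variant
        for m in range(n_markers):
            p_alt = sum(pi[k] * q[k][m] for k in range(n_variants))
            p_ref = 1.0 - p_alt
            for k in range(n_variants):
                # E-step: split each read count across variants by posterior weight
                if alt_counts[m]:
                    expected[k] += alt_counts[m] * pi[k] * q[k][m] / p_alt
                if ref_counts[m]:
                    expected[k] += ref_counts[m] * pi[k] * (1.0 - q[k][m]) / p_ref
        # M-step: proportions are the normalized expected read counts
        new_pi = [e / total_reads for e in expected]
        delta = max(abs(a - b) for a, b in zip(new_pi, pi))
        pi = new_pi
        if delta < tol:
            break
    return pi


# Illustrative data only: three made-up variants, four made-up markers.
signatures = [
    [1, 0, 0, 1],  # variant A
    [0, 1, 0, 1],  # variant B
    [0, 0, 1, 0],  # variant C
]
# Counts constructed for a pool mixed at 60% / 30% / 10% with 1000 reads per marker.
alt_counts = [598, 304, 108, 892]
ref_counts = [402, 696, 892, 108]

estimated = estimate_proportions(signatures, alt_counts, ref_counts)
print([round(p, 3) for p in estimated])
```

On this constructed input the estimates should land close to the 0.6, 0.3, and 0.1 proportions used to build the counts. Reducing reads to per-marker counts loosely mirrors the "predefined markers" input mode mentioned above; a model working on raw reads would need a likelihood over whole reads instead.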
Findings
The model accurately recovers variant proportions in simulated data.
The method's results align well with epidemiological data from wastewater samples.
The method supports both raw sequencing reads and markers in VCF format.
Abstract
Despite the fast development of highly effective vaccines to control the current COVID-19 pandemic, the unequal distribution and availability of these vaccines worldwide and the number of people infected in the world lead to the continuous emergence of SARS-CoV-2 (Severe Acute Respiratory Syndrome coronavirus 2) variants of concern. It is likely that real-time genomic surveillance will continue to be needed as a monitoring tool to follow the spread of the disease and the evolution of the virus. In this context, the possible emergence of new genomic variants of SARS-CoV-2 in response to selective pressure, including variants refractory to current vaccines, makes genomic surveillance programs tools of utmost importance. Here we propose a statistical model for the estimation of the relative frequencies of SARS-CoV-2 variants in pooled samples. This model is built…
