# Highly Recurrent Multinucleotide Mutations in SARS-CoV-2

**Authors:** Nicola De Maio, Olivier Anoufa, Kyle Smith, Yatish Turakhia, Nick Goldman

PMC12619124 · DOI: 10.1093/molbev/msaf272 · Molecular Biology and Evolution · 2025-10-24

## TL;DR

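By analysing over 2 million publicly shared SARS-CoV-2 genomes, this paper shows that certain multinucleotide mutations recur independently, in some cases hundreds of times, that nearly all of them are linked to transcription regulatory sequences, and that they account for approximately 12% of false positives in recombination inference.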

## Contribution

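The paper identifies a previously unknown mutational pattern in SARS-CoV-2: highly recurrent multinucleotide mutations. It links them to transcription regulatory sequences and proposes template switching during the virus's natural transcription process as the mechanism.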

## Key findings

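- Certain multinucleotide mutations in SARS-CoV-2 are highly recurrent: they change the same nucleotides at the same genome position in the same way, and the most frequent have occurred independently hundreds of times across all SARS-CoV-2 lineages.
- 14 of the 15 recurrent multinucleotide mutations, corresponding to 97.6% of all individual occurrences, are linked to transcription regulatory sequences and may result from template switching during transcription.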
- These mutations cause approximately 12% of false positives in recombination inference in SARS-CoV-2.
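
To make "highly recurrent" concrete, the following is a minimal sketch of how one might count recurrent multinucleotide mutations once independent mutation events have already been inferred (for example, along the branches of a phylogeny). It is not the authors' pipeline: the `MutationEvent` record, the function names, the `min_occurrences` threshold, and the toy coordinates and alleles are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class MutationEvent(NamedTuple):
    """One inferred, independent mutation event (hypothetical record layout)."""
    branch: str    # identifier of the phylogeny branch carrying the event
    position: int  # genome coordinate of the first affected nucleotide
    ref: str       # nucleotides before the event
    alt: str       # nucleotides after the event


def is_multinucleotide(ev: MutationEvent) -> bool:
    """True if a single event substitutes at least two nucleotides at once."""
    if len(ev.ref) != len(ev.alt):
        return False  # this sketch ignores insertions and deletions
    return sum(r != a for r, a in zip(ev.ref, ev.alt)) >= 2


def recurrent_mnms(events: Iterable[MutationEvent], min_occurrences: int = 2) -> dict:
    """Count events that share position, ref and alt; keep the recurrent ones."""
    counts = Counter(
        (ev.position, ev.ref, ev.alt) for ev in events if is_multinucleotide(ev)
    )
    return {key: n for key, n in counts.items() if n >= min_occurrences}


# Toy data only: positions and alleles are invented, not taken from the paper.
events = [
    MutationEvent("b1", 100, "ACG", "TTA"),
    MutationEvent("b2", 100, "ACG", "TTA"),
    MutationEvent("b3", 100, "ACG", "TTA"),
    MutationEvent("b4", 100, "ACG", "TTC"),  # same site, different outcome
    MutationEvent("b5", 250, "C", "T"),      # single-nucleotide, ignored
    MutationEvent("b6", 400, "GG", "AA"),    # multinucleotide but seen once
]

print(recurrent_mnms(events))
```

Keying on the full `(position, ref, alt)` triple mirrors the definition of recurrence used here: the same nucleotides, at the same position, changed in the same way. Events at the same site with a different outcome are counted separately.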

## Abstract

Multinucleotide mutations simultaneously replace multiple nucleotides. They are a significant contributor to evolution and disease, as well as to misdiagnosis, misannotation and other biases in genome data analysis. Multinucleotide mutations are generally thought to be rare and random events. However, by processing over 2 million publicly shared genomes, we show that certain multinucleotide mutations are highly recurrent in SARS-CoV-2: they repeatedly and consistently modify the same multiple nucleotides at the same genome position in the same way. The most frequent of these multinucleotide mutations have independently occurred hundreds of times across all SARS-CoV-2 lineages. We find evidence that the vast majority of these recurrent multinucleotide mutations (14 out of 15, corresponding to 97.6% of all individual occurrences) are linked to transcription regulatory sequences. We propose a mechanism that can explain them through template switching as part of the natural transcription process of the virus. This previously unknown mutational pattern increases our understanding of the evolution of SARS-CoV-2 and potentially many other nidoviruses. It also has important consequences for computational evolutionary biology: we show, for example, that recurrent multinucleotide mutations cause approximately 12% of false positives during inference of recombination in SARS-CoV-2.

## Full-text entities

- **Species:** Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12619124/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12619124/full.md

## References

71 references — full list in the complete paper: https://tomesphere.com/paper/PMC12619124/full.md

---
Source: https://tomesphere.com/paper/PMC12619124