# Amplidiff: an optimized amplicon sequencing approach to estimating lineage abundances in viral metagenomes

**Authors:** Jasper van Bemmelen, Davida S. Smyth, Jasmijn A. Baaijens

PMC · DOI: 10.1186/s12859-024-05735-4 · BMC Bioinformatics · 2024-03-23

## TL;DR

AmpliDiff is a tool that identifies highly discriminatory regions in viral genomes and designs primers to amplify them, so that lineage abundances in metagenomes can be estimated accurately.

## Contribution

AmpliDiff introduces a method to simultaneously identify informative genomic regions and design primers for accurate lineage abundance estimation in viral metagenomes.

## Key findings

- AmpliDiff achieves comparable accuracy to whole genome sequencing in estimating SARS-CoV-2 lineage abundances.
- The tool is robust against incomplete input data, and its primers still bind to genomes sampled months after the primers were selected.
- AmpliDiff provides a cost-efficient alternative to whole genome sequencing for viral metagenome analysis.

## Abstract

Metagenomic profiling algorithms commonly rely on genomic differences between lineages, strains, or species to infer the relative abundances of sequences present in a sample. This observation plays an important role in the analysis of diverse microbial communities, where targeted sequencing of 16S and 18S rRNA, both well-known hypervariable genomic regions, has led to insights into microbial diversity and the discovery of novel organisms. However, the variable nature of discriminatory regions is a double-edged sword: the sought-after variability can make it difficult to design primers for amplifying these regions through PCR. Moreover, the most variable regions are not necessarily the most informative for differentiation; one should focus on regions that maximize the number of lineages that can be distinguished.
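The distinction between "most variable" and "most informative" can be illustrated with a small sketch. The four aligned toy genomes, the window coordinates, and the helper names `variable_sites` and `distinguished_pairs` are all hypothetical and are not taken from the paper or the AmpliDiff code; the sketch only counts, for a candidate region, how many alignment columns vary and how many genome pairs the region can tell apart.

```python
from itertools import combinations

# Hypothetical aligned toy genomes, one per lineage (not real data).
genomes = {
    "A": "ACGTGGTT",
    "B": "ACGTGGTA",
    "C": "ACGTGGAT",
    "D": "TGCAGGAA",
}
seqs = list(genomes.values())


def variable_sites(seqs, start, width):
    """Number of alignment columns in the window with more than one base."""
    return sum(len({s[i] for s in seqs}) > 1 for i in range(start, start + width))


def distinguished_pairs(seqs, start, width):
    """Number of genome pairs whose sequences differ inside the window."""
    end = start + width
    return sum(a[start:end] != b[start:end] for a, b in combinations(seqs, 2))


for start in (0, 4):
    print(start, variable_sites(seqs, start, 4), distinguished_pairs(seqs, start, 4))
# prints:
# 0 4 3
# 4 2 6
```

In this toy alignment the window starting at position 0 varies in every column, yet it only separates lineage D from the rest (3 of 6 pairs). The window starting at position 4 has half as many variable columns but separates all six pairs.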

Here we present AmpliDiff, a computational tool that simultaneously finds highly discriminatory genomic regions in viral genomes of a single species and primers that allow these regions to be amplified. We show that regions and primers found by AmpliDiff can be used to accurately estimate relative abundances of SARS-CoV-2 lineages, for example in wastewater sequencing data. The resulting errors are comparable to those obtained when whole genome information is used to estimate relative abundances. Furthermore, our results show that AmpliDiff is robust against incomplete input data and that primers designed by AmpliDiff also bind to genomes sampled months after the primers were selected.
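One way to picture the region-selection part of this problem is as a coverage problem over genome pairs. The following is a minimal sketch under stated assumptions, not AmpliDiff's actual algorithm: it uses a plain greedy heuristic, ignores primer design entirely, and runs on hypothetical toy sequences. The function names `pairs_split_by` and `greedy_windows` are invented for illustration.

```python
from itertools import combinations

# Hypothetical aligned toy genomes, one per lineage (not real data).
genomes = {
    "A": "AACCGG",
    "B": "AACCGT",
    "C": "TACCGG",
    "D": "TACCGT",
}


def pairs_split_by(genomes, start, width):
    """Set of lineage pairs whose sequences differ in the given window."""
    end = start + width
    return {
        (a, b)
        for a, b in combinations(sorted(genomes), 2)
        if genomes[a][start:end] != genomes[b][start:end]
    }


def greedy_windows(genomes, width, max_windows):
    """Greedily pick windows that separate the most not-yet-separated pairs.

    Assumption: primer feasibility is NOT checked here; a real tool would
    also have to verify that each chosen window can be amplified.
    """
    length = len(next(iter(genomes.values())))
    candidates = {
        s: pairs_split_by(genomes, s, width) for s in range(length - width + 1)
    }
    covered, chosen = set(), []
    for _ in range(max_windows):
        best = max(candidates, key=lambda s: len(candidates[s] - covered))
        gain = candidates[best] - covered
        if not gain:  # no remaining window separates any new pair
            break
        chosen.append(best)
        covered |= gain
    return chosen, covered


chosen, covered = greedy_windows(genomes, width=2, max_windows=2)
print(chosen, len(covered))  # prints: [0, 4] 6
```

Here neither window alone separates all lineages: the window at position 0 splits {A, B} from {C, D}, and the window at position 4 then resolves the remaining pairs A-B and C-D, so the two together distinguish all six pairs.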

With AmpliDiff we provide an effective, cost-efficient alternative to whole genome sequencing for estimating lineage abundances in viral metagenomes.

## Linked entities

- **Diseases:** SARS-CoV-2 (MONDO:0100096)

## Full-text entities

- **Species:** Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC10960382/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10960382/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/PMC10960382/full.md

---
Source: https://tomesphere.com/paper/PMC10960382