# CholeraSeq: a comprehensive genomic pipeline for cholera surveillance and near real-time outbreak investigation

**Authors:** Massimiliano S Tagliamonte, Abhinav Sharma, Alberto Riva, Monika Moir, Marco Salemi, Cheryl Baxter, Tulio de Oliveira, Carla N Mavian, Eduan Wilkinson

PMC · DOI: 10.1093/bioinformatics/btaf665 · Bioinformatics · 2025-12-16

## TL;DR

CholeraSeq is a genomic pipeline for cholera surveillance that streamlines outbreak investigations using next-generation sequencing data.

## Contribution

CholeraSeq introduces a unified, reproducible pipeline for cholera genomics, including a curated core genome alignment for rapid outbreak analysis.

## Key findings

- CholeraSeq integrates multiple genomic analysis steps into a single pipeline for cholera surveillance.
- The pipeline supports various input types and scales across computing environments.
- A curated core genome alignment is provided for fast epidemiological placement of new strains.

## Abstract

Next-generation sequencing is widely deployed in cholera-endemic regions, yet the field has lacked an end-to-end, reproducible pipeline that unifies read QC, filtering, reference mapping, variant calling and annotation, recombination screening, extraction of parsimony-informative sites and variant codons, and phylogenetic inference for downstream phylodynamic and epidemiological analyses. This gap slows outbreak investigation and public health response. CholeraSeq is a high-throughput genomics pipeline for cholera genomic surveillance. It ingests consensus genomes, short-read sequence data, and draft assemblies, and it scales seamlessly from local to cloud environments. To accelerate epidemiological context placement of new outbreak strains, we provide a curated, ready-to-use core genome alignment compiled from public data, enabling flexible, fast integration of new samples for outbreak investigations.

CholeraSeq is freely available on GitHub at https://github.com/CERI-KRISP/CholeraSeq. It is implemented in Nextflow with a modular design that builds on nf-core community standards.
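
One of the steps named in the abstract, extraction of parsimony-informative sites, can be illustrated with a small sketch. It uses the standard definition: an alignment column is parsimony-informative when at least two different character states each occur in at least two sequences. This is not CholeraSeq's own code. The function names, the toy alignment, and the choice to ignore gaps and ambiguity codes are assumptions made for illustration only.

```python
from collections import Counter

# Assumption for this sketch: only unambiguous nucleotides count as states;
# gaps and IUPAC ambiguity codes (N, R, Y, ...) are ignored.
VALID_BASES = frozenset("ACGT")


def is_parsimony_informative(column):
    """Return True if at least two states each occur in at least two sequences."""
    counts = Counter(base for base in column.upper() if base in VALID_BASES)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def parsimony_informative_sites(sequences):
    """Return 0-based column indices of parsimony-informative sites."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    return [
        i
        for i in range(length)
        if is_parsimony_informative("".join(seq[i] for seq in sequences))
    ]


# Hypothetical toy alignment: four sequences, five columns.
toy_alignment = [
    "ACGTA",
    "ACGTA",
    "ATGCA",
    "ATGCC",
]
print(parsimony_informative_sites(toy_alignment))  # prints [1, 3]
```

In the toy alignment, columns 1 and 3 each split the sequences two against two, so they are informative. Column 4 contains a single divergent base (a singleton) and is not, because a singleton cannot favour one tree topology over another under parsimony.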

## Linked entities

- **Diseases:** cholera (MONDO:0015766)

## Full-text entities

- **Diseases:** cholera (MESH:D002771)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12790814/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12790814/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/PMC12790814/full.md

---
Source: https://tomesphere.com/paper/PMC12790814