# CRESSENT: a bioinformatics toolkit to explore and improve ssDNA virus annotation

**Authors:** Ricardo R. Pavan, Matthew B. Sullivan, Michael J. Tisza

PMC · DOI: 10.1099/mgen.0.001632 · Microbial Genomics · 2026-02-05

## TL;DR

CRESSENT is a new toolkit for analyzing and improving the annotation of single-stranded DNA viruses in metagenomic data.

## Contribution

CRESSENT introduces a modular "genome-to-analysis" pipeline for ssDNA virus annotation; each module can be run independently or combined with others to build a custom workflow.

## Key findings

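- In benchmarking, CRESSENT completed small family-level analyses within minutes and moderate comparative genomics studies within hours on standard computing resources, with low memory usage.
- The toolkit includes modules for sequence dereplication, decontamination, phylogenetic analysis, motif discovery, stem-loop structure prediction and recombination detection.
- CRESSENT enables systematic inclusion of ssDNA viruses in viromics workflows and comparative genomic studies, which are often limited to dsDNA viruses.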

## Abstract

ssDNA viruses are important components of diverse ecosystems; however, it remains challenging to systematically identify and classify them. This is partly due to their broad host range and resulting genomic diversity, structure and rapid evolutionary rates. In addition, distinguishing genuine ssDNA genomes from contaminating sequences in metagenomic datasets (e.g. from commercial kits) has been an unresolved issue for years. Here, we present CRESSENT (CRESS-DNA Extended aNnotation Toolkit), a comprehensive and modular bioinformatic pipeline focused on ssDNA virus ‘genome-to-analysis’ and annotation. The pipeline integrates multiple functionalities organized into several modules: sequence dereplication, decontamination, phylogenetic analysis, motif discovery, stem-loop structure prediction and recombination detection. Each module can be used independently or in combination with others, allowing researchers to customize their analysis workflow. With this tool, researchers can comprehensively and systematically include ssDNA viruses in their viromics workflows and facilitate comparative genomic studies, which are often limited to dsDNA viruses, therefore leaving behind a crucial component of the microbiome community under study. Benchmarking analyses demonstrated that CRESSENT efficiently processes ssDNA virus datasets of varying scales, completing small family-level analyses within minutes and moderate comparative genomics studies within hours using standard computing resources. Its modular, parallelized design ensures scalability and low memory usage, making it accessible to research groups with diverse computational capacities.
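To make the motif discovery and stem-loop steps more concrete, here is a minimal Python sketch of one naive way to flag stem-loop candidates: locate a short motif, then check whether the bases on either side of it form a Watson-Crick paired stem. This is not CRESSENT's implementation, which the abstract does not describe. The function name `find_stem_loops`, its parameters, and the default motif `TAATATTAC` (a nonanucleotide of the kind reported at the replication origin of some CRESS-DNA viruses) are illustrative assumptions. The sketch treats the sequence as linear and ignores thermodynamics, so a real analysis would need a proper folding tool and handling of circular genomes.

```python
# Illustrative sketch only; names, defaults and logic are assumptions,
# not CRESSENT's actual API or algorithm.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_stem_loops(seq, motif="TAATATTAC", min_stem=5, max_flank=4):
    """Return the longest perfectly paired stem around each motif occurrence.

    The loop is the motif plus up to `max_flank` unpaired bases on each side.
    Coordinates are 0-based; `stem_end` is exclusive.
    """
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        end = start + len(motif)
        best = None
        for left_flank in range(max_flank + 1):
            for right_flank in range(max_flank + 1):
                left = start - left_flank - 1   # innermost base of the 5' arm
                right = end + right_flank       # innermost base of the 3' arm
                stem = 0
                # Extend the stem outward while the two arms keep pairing.
                while (
                    left - stem >= 0
                    and right + stem < len(seq)
                    and PAIRS.get(seq[left - stem]) == seq[right + stem]
                ):
                    stem += 1
                if stem >= min_stem and (best is None or stem > best["stem_len"]):
                    best = {
                        "motif_start": start,
                        "stem_len": stem,
                        "stem_start": left - stem + 1,
                        "stem_end": right + stem,
                        "loop": seq[left + 1:right],
                    }
        if best is not None:
            hits.append(best)
        start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    # Synthetic example: a 9-bp stem flanking a loop that contains the motif.
    example = "AAAA" + "GGCCATCCG" + "A" + "TAATATTAC" + "C" + "CGGATGGCC" + "AAAA"
    for hit in find_stem_loops(example):
        print(hit)
```

Because only perfect pairing is accepted and only the longest stem per motif occurrence is kept, this sketch will miss stems with mismatches or bulges. Relaxing that would be a natural next step, but is left out to keep the example short.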

## Full-text entities

- **Species:** DNA viruses [taxon 2080735], Faba bean necrotic yellows virus (no rank) [taxon 59817]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12877143/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12877143/full.md

## References

61 references — full list in the complete paper: https://tomesphere.com/paper/PMC12877143/full.md

---
Source: https://tomesphere.com/paper/PMC12877143