# Mitigation and detection of putative microbial contaminant reads from long-read metagenomic datasets

**Authors:** Stefany Ayala-Montaño, Ayorinde O. Afolayan, Raisa Kociurzynski, Ulrike Loeber, Sandra Reuter

PMC · DOI: 10.1099/mgen.0.001609 · 2026-01-22

## TL;DR

This paper introduces 'Stop-Check-Go', a method for detecting and mitigating microbial contaminants in long-read metagenomic datasets from neonatal patient samples (nasal and rectal swabs).

## Contribution

The 'Stop-Check-Go' system targets decontamination of low-biomass clinical metagenomic samples by combining laboratory work with bioinformatics: a prevalence method, coverage estimation and microbiological reports.

## Key findings

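- Host DNA decreased by an average of 76% per sample using a lysis method, and was further reduced during post-sequencing analysis.
- Microbial species were classified as putative contaminants and assigned to 'Stop' in nearly 60% of the dataset.
- Other published decontamination tools commonly performed poorly on samples from microbiologically negative patients (false positives).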
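
The "average of 76% per sample" figure is a mean of per-sample percentage reductions. A minimal sketch of that arithmetic follows, assuming host DNA is quantified before and after lysis in the same units for each sample. The function names and the measurement values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def percent_reduction(before: float, after: float) -> float:
    """Percentage decrease of host DNA in a single sample."""
    if before <= 0:
        raise ValueError("'before' must be positive")
    return 100.0 * (before - after) / before


def mean_percent_reduction(pairs) -> float:
    """Average the per-sample reductions (not the reduction of the pooled totals)."""
    return mean(percent_reduction(before, after) for before, after in pairs)


# Hypothetical (before, after) host DNA measurements, for illustration only.
samples = [(100.0, 20.0), (50.0, 25.0), (80.0, 40.0)]
print(mean_percent_reduction(samples))
```

Averaging per sample gives every sample equal weight, whereas a reduction computed on pooled totals would be dominated by the samples with the most host DNA.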

## Abstract

Metagenomic sequencing of clinical samples has significantly enhanced our understanding of microbial communities. However, microbial contamination and host-derived DNA remain major obstacles to accurate data interpretation. Here, we present a methodology called ‘Stop-Check-Go’ for detecting and mitigating contaminants in metagenomic datasets obtained from neonatal patient samples (nasal and rectal swabs). This method incorporates laboratory and bioinformatics work, combining a prevalence method, coverage estimation and microbiological reports. We compared the ‘Stop-Check-Go’ decontamination system with other published decontamination tools and found that these commonly performed poorly when decontaminating samples from microbiologically negative patients (false positives). Host DNA decreased by an average of 76% per sample using a lysis method and was further reduced during post-sequencing analysis. Microbial species were classified as putative contaminants and assigned to ‘Stop’ in nearly 60% of the dataset. The ‘Stop-Check-Go’ system was developed to address the specific need of decontaminating low-biomass samples, where existing tools, primarily designed for short-read metagenomic data, showed limited performance.
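
The abstract names three lines of evidence (a prevalence method, coverage estimation and microbiological reports) but does not give the decision rules. The sketch below shows one way such a three-way triage could be wired together. It is not the authors' procedure: the field names, the interpretation of prevalence as presence in negative controls, both thresholds and the rule order are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaxonCall:
    species: str
    prevalence_in_controls: float  # assumed: fraction of negative controls containing the taxon (0..1)
    genome_coverage: float         # assumed: fraction of the reference genome covered by reads (0..1)
    in_micro_report: bool          # assumed: taxon listed in the patient's microbiological report


# Hypothetical thresholds; the summary does not state any cut-offs.
MAX_CONTROL_PREVALENCE = 0.5
MIN_COVERAGE = 0.1


def triage(call: TaxonCall) -> str:
    """Return 'Stop', 'Check' or 'Go' for one taxon under the assumed rules."""
    if call.in_micro_report:
        return "Go"  # assumed: independent microbiological evidence outweighs the other signals
    frequent_in_controls = call.prevalence_in_controls > MAX_CONTROL_PREVALENCE
    poorly_covered = call.genome_coverage < MIN_COVERAGE
    if frequent_in_controls and poorly_covered:
        return "Stop"   # both signals point to a putative contaminant
    if frequent_in_controls or poorly_covered:
        return "Check"  # signals disagree, so flag for manual review
    return "Go"


calls = [
    TaxonCall("Klebsiella pneumoniae", 0.0, 0.8, True),
    TaxonCall("Pseudomonas aeruginosa", 0.9, 0.02, False),
    TaxonCall("Staphylococcus epidermidis", 0.7, 0.6, False),
]
for call in calls:
    print(call.species, triage(call))
```

The species names come from the entity list of this summary, while the values attached to them are invented. Keeping an intermediate 'Check' class means ambiguous taxa are neither silently discarded nor silently accepted.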

## Full-text entities

- **Diseases:** JSD (MESH:C537568)
- **Chemicals:** agar (MESH:D000362), PBS (MESH:D007854), water (MESH:D014867), saponin (MESH:D012503), BHI (-)
- **Species:** Staphylococcus sp. (species) [taxon 29387], Homo sapiens (human, species) [taxon 9606], Pseudomonas aeruginosa (species) [taxon 287], Klebsiella pneumoniae (species) [taxon 573], Klebsiella michiganensis (species) [taxon 1134687], Veillonella sp. (species) [taxon 1926307], Escherichia coli (E. coli, species) [taxon 562], Klebsiella oxytoca (species) [taxon 571], Mus musculus (house mouse, species) [taxon 10090], Enterococcus sp. (species) [taxon 35783], Staphylococcus aureus (species) [taxon 1280], Klebsiella sp. (species) [taxon 576], Klebsiella aerogenes (species) [taxon 548], Klebsiella variicola (species) [taxon 244366], Enterobacterales (order) [taxon 91347], Staphylococcus epidermidis (species) [taxon 1282], Streptococcus sp. (species) [taxon 1306]
- **Mutations:** T2T

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12828179/full.md

---
Source: https://tomesphere.com/paper/PMC12828179