NEMESISdb: A full length 16S rRNA gene dataset for the detection of human, fish, and crustacean potentially pathogenic bacteria
Son-Hoang Tran, Claudia Ximena Restrepo-Ortiz, Dinh Quang Vu, Marc Troussellier, Yvan Bettarel, Thierry Bouvier, Van Ngoc Bui, Nguyen Hieu Minh, Trung Du Hoang, Quang Huy Nguyen, Jean-Christophe Auguet

TL;DR
NEMESISdb is a curated dataset of full-length 16S rRNA gene sequences for identifying potentially pathogenic bacteria in humans, fish, and crustaceans.
Contribution
NEMESISdb introduces a comprehensive, curated 16S rRNA dataset focused on marine and coastal pathogenic bacteria for human, fish, and crustacean hosts.
Findings
NEMESISdb includes over 150,000 curated 16S rRNA sequences for 1703 human, 222 fish, and 64 crustacean pathogenic bacteria species.
The dataset is optimized for use with BLAST and classifier tools for accurate detection in metagenomic and metabarcoding studies.
NEMESISdb supports One Health research by linking pathogen circulation across environmental, animal, and human systems.
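As a rough illustration of the BLAST-based use mentioned above, the sketch below filters BLAST tabular output (`-outfmt 6`) from a search of query sequences against such a reference set and keeps the best-scoring hit per query. The function name `best_hits`, the thresholds `min_identity` and `min_length`, and the toy rows are assumptions made for illustration; they are not values or code from the paper.

```python
import csv
import io

# Standard column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(blast6_text, min_identity=99.0, min_length=1200):
    """Return the best-scoring hit per query that passes both thresholds.

    min_identity and min_length are illustrative assumptions, not
    thresholds taken from the NEMESISdb paper.
    """
    best = {}
    reader = csv.reader(io.StringIO(blast6_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(BLAST6_COLUMNS, row))
        pident = float(hit["pident"])
        length = int(hit["length"])
        bitscore = float(hit["bitscore"])
        if pident < min_identity or length < min_length:
            continue
        current = best.get(hit["qseqid"])
        # Keep the hit with the highest bit score for each query.
        if current is None or bitscore > current[2]:
            best[hit["qseqid"]] = (hit["sseqid"], pident, bitscore)
    return best


# Toy rows with made-up query and reference identifiers.
example = (
    "asv1\trefA\t99.8\t1450\t3\t0\t1\t1450\t1\t1450\t0.0\t2650\n"
    "asv1\trefB\t99.2\t1448\t11\t0\t1\t1448\t1\t1448\t0.0\t2600\n"
    "asv2\trefC\t96.5\t1440\t50\t1\t1\t1440\t1\t1440\t0.0\t2300\n"
)
hits = best_hits(example)
```

In this toy input, `asv2` is dropped because its identity falls below the assumed 99% cutoff. A real analysis would choose thresholds suited to the read length and the taxonomic resolution required.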
Abstract
NEMESISdb is a curated dataset of full-length 16S rRNA gene sequences designed to enable the identification and tracking of potentially pathogenic bacteria (PPB) of human, fish, and crustacean hosts. It addresses the limited attention paid to marine and coastal environments as key reservoirs for PPB, where bacteria from diverse sources (terrestrial, marine, and animal) can coexist. Leveraging recent advances in high-throughput sequencing, NEMESISdb provides a robust resource for detecting PPB in 16S rRNA gene metabarcoding or metagenomic data. The database comprises three datasets corresponding to human, fish, and crustacean hosts, containing 1703, 222, and 64 PPB species, respectively, with a total of over 150,000 full-length 16S rRNA sequences curated for accuracy. This resource was constructed by extracting sequences from the SILVA 138.2 SSU Ref NR99 database, refining them through a rigorous…
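The extraction step named in the abstract can be pictured with a minimal sketch: read a FASTA file and keep the records whose species label is on a target list. This assumes headers of the form `accession taxonomy;path;species`, as in SILVA FASTA exports. The helper names `read_fasta` and `select_by_species`, the accessions, and the one-species target list are hypothetical, and the authors' actual refinement procedure is not reproduced here.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_by_species(records, target_species):
    """Keep records whose last taxonomy field is in target_species."""
    selected = []
    for header, seq in records:
        # Assumed header layout: "<accession> <semicolon-separated taxonomy>".
        accession, _, taxonomy = header.partition(" ")
        species = taxonomy.split(";")[-1].strip()
        if species in target_species:
            selected.append((accession, species, seq))
    return selected


# Toy records with made-up accessions and shortened sequences.
fasta = [
    ">AB000001.1.1500 Bacteria;Pseudomonadota;Gammaproteobacteria;"
    "Vibrionales;Vibrionaceae;Vibrio;Vibrio cholerae",
    "ACGUACGU",
    "ACGU",
    ">AB000002.1.1480 Bacteria;Bacillota;Bacilli;Lactobacillales;"
    "Lactobacillaceae;Lactobacillus;Lactobacillus delbrueckii",
    "GGCCGGCC",
]

targets = {"Vibrio cholerae"}  # hypothetical target list
subset = select_by_species(read_fasta(fasta), targets)
```

Matching on the exact species string is the simplest possible rule. Synonyms, strain suffixes, and unclassified entries would need extra handling in practice.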
Figure 1
Figure 2
Figure 3
Figure 4
Taxonomy
Topics: Vibrio bacteria research studies · Aquaculture disease management and microbiota · Identification and Quantification in Food
