Complete genome sequence of three atypical diarrheagenic Escherichia coli O166:H15 strains from foodborne outbreaks
Akiko Kubomura, Kenichi Lee, Sunao Iyoda, Yukihiro Akeda

TL;DR
This paper reports the complete genome sequences of three unusual Escherichia coli O166:H15 strains linked to foodborne outbreaks in Japan.
Contribution
The study provides new genomic insights into atypical E. coli strains lacking common virulence factors.
Findings
The genome sequences of three E. coli O166:H15 isolates were fully characterized.
These strains were found to lack typical virulence factors seen in diarrheagenic E. coli.
The findings contribute to understanding the genomic basis of these atypical strains.
Abstract
Escherichia coli O166:H15 strains have been repeatedly isolated from foodborne outbreaks in Japan. These strains lack the major virulence factors typically associated with diarrheagenic E. coli. To characterize the genomic features of these atypical strains, we present the complete genome sequences of three E. coli O166:H15 isolates from foodborne outbreaks.
| Isolate name | Isolation year | No. of patients identified with the strain | Location | Chromosome size (bp) (accession no.) | Plasmid size (bp) (accession no.) | No. of CDS | No. of rRNAs | No. of tRNAs | G + C content (%) | Short read: no. of reads | Short read: coverage | Short read: DRA accession no. | Long read: no. of reads | Long read: coverage | Long read: N50 | Long read: DRA accession no. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| JNE070796 | 2006 | 147 | Kumamoto | 5,051,314 | 138,061 | 5,108 | 22 | 92 | 51 | 2,116,954 | 108.2 | | 16,016 | 46.0 | 23,721 | |
| JNE21-003 | 2021 | 181 | Saitama | 5,079,930 | 1,399,961 | 4,943 | 22 | 91 | 51 | 2,057,706 | 105.4 | | 47,985 | 152.1 | 25,040 | |
| JNE21-009 | 2016 | 28 | Hyogo | 5,144,898 | 188,563 | 5,418 | 22 | 96 | 51 | 1,157,728 | 52.4 | | 47,676 | 139.7 | 23,745 | |
- Health and Labour Sciences Research Grant
- Health and Labour Sciences Research Grant
- Health and Labour Sciences Research Grant
- Japan Society for the Promotion of Science (http://dx.doi.org/10.13039/501100001691)
- Japan Agency for Medical Research and Development (http://dx.doi.org/10.13039/100009619)
Taxonomy
Topics: Escherichia coli research studies · Probiotics and Fermented Foods · Bacteriophages and microbial interactions
ANNOUNCEMENT
Diarrheagenic Escherichia coli (DEC) is classified into five major pathotypes, including Shiga toxin-producing E. coli and enterotoxigenic E. coli (1). Each pathotype is characterized by specific virulence genes, such as eae (encoding intimin) and elt (encoding heat-labile toxin). However, in Japan, E. coli O166:H15 strains that lack these known virulence genes have been associated with several foodborne outbreaks over the past 30 years (2, 3). To investigate their genomic features and potential pathogenicity, we determined the complete genome sequences of three E. coli O166:H15 isolates.
These isolates were obtained from three independent foodborne outbreaks (Table 1). In all cases, identical strains were isolated from multiple patients, and no other pathogens were detected. On the basis of these findings, E. coli O166:H15 was determined to be the causative agent. The strains were recovered from clinical samples by local public health institutes and subsequently transferred to the National Institute of Infectious Diseases. Bacterial cultures were grown overnight in buffered peptone water (Nissui Pharmaceutical, Japan) at 37°C, and genomic DNA was extracted using the MagMax DNA Multi-Sample Ultra 2.0 Kit with the KingFisher Duo Prime system (Thermo Fisher Scientific, USA). PCR screening confirmed that none of the isolates possessed known DEC marker genes, including stx1/2, elt, estA1/2, invE, eae, aggR, and afaD (Table S1; https://doi.org/10.5281/zenodo.17090503). Subsequent genome analysis using VirulenceFinder (https://cge.food.dtu.dk/services/VirulenceFinder/) likewise detected no major virulence factors, including adhesins, invasion factors, or toxins. Among the three isolates, more than 200 core genome SNPs were identified.
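The announcement does not state how the core genome SNPs were called. Purely to illustrate what a core genome SNP count measures, the sketch below counts variable positions in a set of already aligned sequences, treating any column with a gap or ambiguous base as outside the core. The function name `count_core_snps` and the toy sequences are hypothetical and are not taken from the authors' workflow.

```python
from typing import Dict

# Only unambiguous bases are treated as part of the core alignment.
VALID_BASES = set("ACGT")


def count_core_snps(alignment: Dict[str, str]) -> int:
    """Count alignment columns that are present in every isolate
    (no gaps or ambiguous bases) and contain more than one base."""
    seqs = [seq.upper() for seq in alignment.values()]
    if not seqs:
        return 0
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")

    snps = 0
    for column in zip(*seqs):
        if not all(base in VALID_BASES for base in column):
            continue  # column is not shared by all isolates: not core
        if len(set(column)) > 1:
            snps += 1  # at least two isolates differ at this position
    return snps


# Toy alignment (hypothetical data, not from the paper).
toy_alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTTCGTAC",
    "isolate_C": "ACG-ACGTAA",
}
toy_snps = count_core_snps(toy_alignment)
```

In the toy alignment, the gapped column is skipped and two of the remaining columns vary, so the count is 2. Real analyses usually add steps this sketch omits, such as masking recombinant or repetitive regions.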
For short-read sequencing, genomic DNA libraries were prepared using the QIAseq FX DNA Library Kit (Qiagen, Germany), followed by paired-end sequencing (2 × 300 bp) on the MiSeq platform (Illumina, USA). For long-read sequencing, libraries were prepared using the Rapid Barcoding Kit V14 (SQK-RBK114.24, Oxford Nanopore Technologies, UK) and sequenced on a MinION Mk1B using an R10.4.1 flow cell over a 72-h run. Basecalling was performed using Dorado v0.9.5 with a super-accurate model ([email protected]). Quality of the reads was assessed by SeqKit v2.10.0 (4), QUAST v5.2.0 (5), and CheckM v1.2.4 (6). Contamination and completeness calculated by CheckM were below 2% and above 99%, respectively, in all the isolates. To obtain complete genomes, hybrid assemblies were generated using Hybracter v0.11.0 (7) (JNE070796 and JNE21-003) or Autocycler v0.4.0 (8) (JNE21-009). Genomes assembled with Autocycler were subjected to short-read polishing by Polypolish v0.5.0 (9) and PyPolca v0.3.1 (10, 11) and long-read polishing by Medaka v0.2.0 (Oxford Nanopore). Genome annotation was carried out using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) v2025-05-06 (12), followed by manual curation. All analyses were performed using the default parameter settings.
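The read statistics listed in Table 1 (number of reads, coverage, N50) and the CheckM thresholds quoted above follow simple definitions. The sketch below shows those definitions in Python under stated assumptions: the function names, the toy read lengths, and the idea of computing coverage as total bases divided by genome size are illustrative and are not the authors' pipeline, which relied on SeqKit, QUAST, and CheckM.

```python
from typing import Iterable, List


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the largest length L such that reads of
    length >= L together contain at least half of all bases."""
    ordered: List[int] = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def mean_coverage(lengths: Iterable[int], genome_size: int) -> float:
    """Approximate mean depth as total sequenced bases / genome size."""
    return sum(lengths) / genome_size


def passes_checkm(completeness: float, contamination: float) -> bool:
    """Apply the thresholds reported in the text:
    completeness above 99% and contamination below 2%."""
    return completeness > 99.0 and contamination < 2.0


# Toy read lengths in bases (hypothetical, chosen for easy checking).
toy_reads = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_reads)
toy_cov = mean_coverage(toy_reads, genome_size=10)
```

For the toy reads, the total is 20 bases; the two longest reads (6 and 5) already cover more than half, so the N50 is 5, and the coverage of a 10-base genome is 2.0.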
Since we only used completely anonymized patient information, approval from our institutional ethics committee was not required.
REFERENCES
- 1. Jesser KJ, Levy K. 2020. Updates on defining and detecting diarrheagenic Escherichia coli pathotypes. Curr Opin Infect Dis 33:372–380. doi:10.1097/QCO.0000000000000665. PMID: 32773499. PMCID: PMC7819864.
- 2. Nishikawa Y, Zhou Z, Hase A, Ogasawara J, Kitase T, Abe N, Nakamura H, Wada T, Ishii E, Haruki K, Surveillance Team. 2002. Diarrheagenic Escherichia coli isolated from stools of sporadic cases of diarrheal illness in Osaka City, Japan between 1997 and 2000: prevalence of enteroaggregative E. coli heat-stable enterotoxin 1 gene-possessing E. coli. Jpn J Infect Dis 55:183–190. doi:10.7883/yoken.JJID.2002.183. PMID: 12606826.
- 3. Zhou Z, Ogasawara J, Nishikawa Y, Seto Y, Helander A, Hase A, Iritani N, Nakamura H, Arikawa K, Kai A, Kamata Y, Hoshi H, Haruki K. 2002. An outbreak of gastroenteritis in Osaka, Japan due to Escherichia coli serogroup O166:H15 that had a coding gene for enteroaggregative E. coli heat-stable enterotoxin 1 (EAST1). Epidemiol Infect 128:363–371. doi:10.1017/s0950268802006994. PMID: 12113479. PMCID: PMC2869831.
- 4. Shen W, Le S, Li Y, Hu F. 2016. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS One 11:e0163962. doi:10.1371/journal.pone.0163962. PMID: 27706213. PMCID: PMC5051824.
- 5. Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29:1072–1075. doi:10.1093/bioinformatics/btt086. PMID: 23422339. PMCID: PMC3624806.
- 6. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25:1043–1055. doi:10.1101/gr.186072.114. PMID: 25977477. PMCID: PMC4484387.
- 7. Bouras G, Houtak G, Wick RR, Mallawaarachchi V, Roach MJ, Papudeshi B, Judd LM, Sheppard AE, Edwards RA, Vreugde S. 2024. Hybracter: enabling scalable, automated, complete and accurate bacterial genome assemblies. Microb Genom 10:001244. doi:10.1099/mgen.0.001244. PMID: 38717808. PMCID: PMC11165638.
- 8. Wick RR, Howden BP, Stinear TP. 2025. Autocycler: long-read consensus assembly for bacterial genomes. bioRxiv. doi:10.1101/2025.05.12.653612. PMID: 40875535. PMCID: PMC12460055.
