The complete genome sequencing of Escherichia coli isolated from a patient who visited Pediatrics People’s Hospital of Pingguo, China
Lifeng Chen, Zeeshan Umar

TL;DR
This paper reports the full genome sequence of an Escherichia coli strain isolated from a pediatric patient in China.
Contribution
The study provides a complete genome sequence of a specific E. coli isolate from a clinical sample in China.
Findings
The genome of E. coli isolate ACESH02881hy is 5,071,463 base pairs in size.
The isolate was obtained from a patient at the Pediatrics People’s Hospital of Pingguo in 2021.
Abstract
In this article, we present a comprehensive analysis of the genome sequence of Escherichia coli isolate ACESH02881hy, which has a genome size of 5,071,463 bp. The strain was isolated from a patient who visited the Pediatrics People’s Hospital of Pingguo, China, in 2021.
| Category | Feature | Value |
|---|---|---|
| Long read | Number of reads | 99,551.0 |
| | Total bases | 1,173,847,871.0 |
| | Mean read length (bp) | 11,791.4 |
| | Mean read quality | 13.7 |
| | Median read length (bp) | 10,761.0 |
| | Median read quality | 14.2 |
| | N50 read length (bp) | 12,979.0 |
| Short read | Total bases | 11,554,920 |
| | Raw base (Mb) | 1,006 |
| | Raw read1 Q20 (%) | 97.67 |
| | Raw read2 Q20 (%) | 96.53 |
| | Clean base (Mb) | 1,006 |
| | Clean read1 Q20 (%) | 97.67 |
| | Clean read2 Q20 (%) | 96.53 |
| | GC (%) | 47.89 |
| Assembly | Genome size (bp) | 5,071,463 |
| | Long read coverage (×) | 470 |
| | Short-read coverage (×) | 200 |
| | GC content (%) | 50.6 |
| | N50 value | 853,191 |
| | L50 value | 2 |
| Resistance gene (antimicrobial) | | |
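As a rough consistency check on the read statistics in the table, mean read length and sequencing depth can be estimated from total bases, read count, and genome size. The sketch below is illustrative only: the function names are hypothetical, "Mb" is assumed to mean 10^6 bases, and depth is taken simply as total bases divided by genome size, which may differ from how the reported coverage values were derived.

```python
def mean_read_length(total_bases: int, n_reads: int) -> float:
    """Average read length in bp: total sequenced bases divided by read count."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return total_bases / n_reads


def estimated_depth(total_bases: float, genome_size: int) -> float:
    """Approximate sequencing depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Values copied from the table.
LONG_READ_BASES = 1_173_847_871
LONG_READ_COUNT = 99_551
SHORT_READ_CLEAN_BASES = 1_006 * 1_000_000  # "Clean base (Mb)"; assumes 1 Mb = 1e6 bases
GENOME_SIZE = 5_071_463

long_mean = mean_read_length(LONG_READ_BASES, LONG_READ_COUNT)
short_depth = estimated_depth(SHORT_READ_CLEAN_BASES, GENOME_SIZE)

print(f"Mean long-read length: {long_mean:.1f} bp")
print(f"Estimated short-read depth: {short_depth:.0f}x")
```

Estimates of this kind are only approximate, since reported coverage is often computed from reads that actually map to the final assembly.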
ANNOUNCEMENT
Enterotoxigenic Escherichia coli (ETEC) is an important enteric pathogen that causes tens of millions of cases of diarrhea annually. Children under 5 years of age are highly susceptible to ETEC infection, especially in regions where the disease is prevalent. In 2015, ETEC was responsible for an estimated 100 million cases of diarrhea and approximately 60,000 deaths (1). This announcement presents the genome of an ETEC strain obtained from a patient who visited the Pediatrics People’s Hospital of Pingguo, China.
A stool sample (1.0 g) was collected and cultured aerobically overnight at 37°C on MacConkey agar. The next day, colonies with distinct morphologies were streaked to obtain pure cultures, and the bacteria were identified using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (Bruker, Bremen, Germany), accepting score values between 2.3 and 3.0 (2). DNA was then extracted from a culture grown overnight at 37°C using the SteadyPure Bacteria Genomic DNA Extraction Kit (AG) according to the manufacturer’s instructions and evaluated with a NanoDrop 2000 UV-visible (UV-Vis) spectrophotometer (Thermo Scientific, USA) and a Qubit version 2.0 fluorometer (Thermo Scientific) (2). The same DNA was then sequenced on the Illumina platform (Novogene Bioinformatics Technology, Beijing, China) and on the Oxford Nanopore platform. For short-read sequencing, DNA was fragmented by sonication to a size of 350 bp, and the library was prepared with the Nextera XT Kit (Illumina, USA) for sequencing on the Illumina NovaSeq 6000 platform. A-tailed fragments were ligated with paired-end adaptors and amplified by PCR using a 500-bp insert. The PCR products were purified with the AMPure XP system (USA), and library quality was evaluated using the Agilent 5400 system (USA) and quantified by qPCR (1.5 nM).
Fastp (version 0.23.1) was used for quality control of the short reads, discarding read pairs with adapter contamination, more than 10% unknown bases, or more than 50% low-quality bases (Phred quality < 5). Default parameters were used except where otherwise noted. For long reads, the DNA library was prepared with the Ligation Sequencing Kit (SQK-LSK114) without shearing the DNA either mechanically or enzymatically. Raw reads from the PromethION were first assessed with the MinKNOW software for real-time quality control (QC) metrics and then base called with the Oxford Nanopore Technologies Guppy software (version 0.17.1) in high-accuracy mode.
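The short-read filtering criteria described above can be illustrated with a small sketch. This is not fastp's implementation; it is a hypothetical Python rendering of the two stated thresholds (unknown-base fraction and low-quality fraction), assuming Phred+33 encoded quality strings. Adapter detection is omitted, and all function names are invented for illustration.

```python
def fraction_unknown(seq: str) -> float:
    """Fraction of bases called as N; an empty read counts as fully unknown."""
    return seq.upper().count("N") / len(seq) if seq else 1.0


def fraction_low_quality(qual: str, min_phred: int = 5, offset: int = 33) -> float:
    """Fraction of bases whose Phred score is below min_phred (Phred+33 assumed)."""
    if not qual:
        return 1.0
    low = sum(1 for ch in qual if ord(ch) - offset < min_phred)
    return low / len(qual)


def read_passes(seq: str, qual: str,
                max_n_frac: float = 0.10,
                max_low_frac: float = 0.50,
                min_phred: int = 5) -> bool:
    """Keep a read unless it exceeds either threshold."""
    return (fraction_unknown(seq) <= max_n_frac
            and fraction_low_quality(qual, min_phred) <= max_low_frac)


def pair_passes(read1: tuple, read2: tuple) -> bool:
    """Keep a pair only if both mates pass; each read is a (sequence, quality) tuple."""
    return read_passes(*read1) and read_passes(*read2)


# Toy reads: 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
good = ("ACGTACGTAC", "IIIIIIIIII")
too_many_n = ("ACNNACGTAC", "IIIIIIIIII")   # 20% N
low_quality = ("ACGTACGTAC", "######IIII")  # 60% of bases below Phred 5

for name, mate in [("good", good), ("too_many_n", too_many_n), ("low_quality", low_quality)]:
    print(name, pair_passes(good, mate))
```

Evaluating the pair as a unit keeps the two FASTQ files synchronized, which paired-end assemblers generally require.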
Unicycler version 0.4.7 was used for the hybrid assembly (3). The complete genome sequence was then submitted to NCBI and annotated with the NCBI Prokaryotic Genome Annotation Pipeline (version 6.6) (4, 5). Acquired antibiotic resistance genes in the isolate were identified using ResFinder 4.1. Related metrics are listed in Table 1.
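The announcement does not give the exact assembly command. As a hedged sketch, a typical Unicycler hybrid run takes the two short-read files and the long-read file as inputs; the helper below only builds such a command. The file names, output directory, and thread count are hypothetical placeholders, not values from the study.

```python
import shlex


def build_unicycler_command(short_1: str, short_2: str, long_reads: str,
                            out_dir: str, threads: int = 8) -> list:
    """Build an argument list for a Unicycler hybrid assembly.

    -1/-2 are the paired short-read files, -l the long reads,
    -o the output directory, and -t the number of threads.
    """
    return [
        "unicycler",
        "-1", short_1,
        "-2", short_2,
        "-l", long_reads,
        "-o", out_dir,
        "-t", str(threads),
    ]


# Hypothetical file names, for illustration only.
cmd = build_unicycler_command(
    "clean_R1.fastq.gz", "clean_R2.fastq.gz", "nanopore.fastq.gz", "assembly_out"
)
print(shlex.join(cmd))
# To execute (requires Unicycler on PATH): subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.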
The complete genome of E. coli ACESH02881hy is 5,071,463 bp in size. The guanine-cytosine (GC) content was approximately 50.6%, and the assembly had an L50 value of 2. Antimicrobial resistance gene analysis showed that the genome contained multiple antimicrobial resistance genes, including blaTEM-1B and blaNDM-1 (beta-lactam).
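For readers unfamiliar with the assembly metrics reported here, the sketch below shows how N50, L50, and GC content are conventionally defined. It uses toy inputs rather than the actual assembly, and the function names are hypothetical.

```python
def n50_l50(lengths: list) -> tuple:
    """Return (N50, L50) for a list of contig lengths.

    Contigs are sorted from longest to shortest. N50 is the length of the
    contig at which the running total first reaches half of the assembly
    size; L50 is the number of contigs needed to get there.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise RuntimeError("unreachable: the running total always reaches half")


def gc_percent(seq: str) -> float:
    """GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100 * (seq.count("G") + seq.count("C")) / acgt


# Toy example: five contigs totaling 300 bp.
toy_n50, toy_l50 = n50_l50([100, 80, 60, 40, 20])
print(toy_n50, toy_l50)
print(gc_percent("ATGC"))
```

An L50 of 2, as reported for this assembly, means that the two longest contigs together account for at least half of the total assembly length.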
REFERENCES
1. Zhang Y, Tan P, Zhao Y, Ma X. 2022. Enterotoxigenic Escherichia coli: intestinal pathogenesis mechanisms and colonization resistance by gut microbiota. Gut Microbes 14:2055943. doi:10.1080/19490976.2022.2055943. PMID: 35358002. PMCID: PMC8973357.
2. Zheng B, Xu H, Lv T, Guo L, Xiao Y, Huang C, Zhang S, Chen Y, Han H, Shen P, Xiao Y, Li L. 2020. Stool samples of acute diarrhea inpatients as a reservoir of ST11 hypervirulent KPC-2-producing Klebsiella pneumoniae. mSystems 5:e00498-20. doi:10.1128/mSystems.00498-20. PMID: 32576652. PMCID: PMC7311318.
3. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi:10.1371/journal.pcbi.1005595. PMID: 28594827. PMCID: PMC5481147.
4. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069. doi:10.1093/bioinformatics/btu153. PMID: 24642063.
5. Li W, O’Neill KR, Haft DH, DiCuccio M, Chetvernin V, Badretdin A, Coulouris G, Chitsaz F, Derbyshire MK, Durkin AS, Gonzales NR, Gwadz M, Lanczycki CJ, Song JS, Thanki N, Wang J, Yamashita RA, Yang M, Zheng C, Marchler-Bauer A, Thibaud-Nissen F. 2021. RefSeq: expanding the prokaryotic genome annotation pipeline reach with protein family model curation. Nucleic Acids Res 49:D1020–D1028. doi:10.1093/nar/gkaa1105. PMID: 33270901. PMCID: PMC7779008.
