Deciphering Escherichia coli ESBL/pAmpC Plasmids Through High-Throughput Third-Generation Sequencing and Hybrid Assembly
Andrea Laconi, Enea Ovedani, Roberta Tolosi, Ilias Apostolakos, Alessandra Piccirillo

TL;DR
This study uses hybrid sequencing to better understand and track antibiotic resistance plasmids in E. coli from broilers.
Contribution
The study demonstrates that hybrid sequencing improves plasmid reconstruction and detection of resistance-associated mobile genetic elements.
Findings
Hybrid assemblies produced the most accurate and complete plasmid reconstructions.
Long and hybrid assemblies detected IS26 and virulence genes missed by short reads.
ARG profiles were consistent across methods, but hybrid assemblies provided better structural resolution.
Abstract
Extended-spectrum β-lactamases (ESBLs) and plasmid-mediated AmpC (pAmpC) β-lactamases represent a threat to public health. Their dissemination is often mediated by mobile genetic elements (MGEs), but plasmid identification and characterization can be hindered by sequencing limitations. Hybrid assembly may overcome these barriers. Eight ESBL/pAmpC-producing E. coli isolates from broilers were sequenced using Illumina (short-read) and Oxford Nanopore MinION (long-read) platforms. Assemblies were generated from each read type individually and using a hybrid approach. Plasmids were typed, annotated, and screened for antimicrobial resistance genes (ARGs), MGEs, and virulence factors. Short-read assemblies were highly fragmented, while long reads improved contiguity but showed typing errors. Hybrid assemblies produced the most accurate and complete plasmids, including more circularized plasmids. Long and hybrid…
1. Introduction
Bacterial resistance to antimicrobial agents is a major threat to public health [1]. Cephalosporins, a class of β-lactam antibiotics, are widely used in both human and veterinary medicine. Third-generation cephalosporins (3GCs) have particular therapeutic value in human medicine and represent one of the limited treatment options for infections caused by multidrug-resistant Enterobacteriaceae [2]. Resistance to these agents is frequently mediated by extended-spectrum β-lactamases (ESBLs) and AmpC β-lactamases (AmpCs). The genes encoding these enzymes are often located on plasmids and associated with mobile genetic elements (MGEs), such as transposons (Tns) and insertion sequences (ISs) [3,4]. Consequently, horizontal gene transfer (HGT) via mobilization or conjugation plays a major role in the dissemination of β-lactam resistance across different strains and genera of the Enterobacteriaceae family [5]. Among them, Escherichia coli is a key player in the spread of antimicrobial resistance (AMR), as it readily acquires and transfers AMR genes (ARGs) to other bacterial species [6]. Thus, genetic characterization of plasmids and their associated MGEs in E. coli is essential for understanding the molecular mechanisms driving AMR dissemination [6].
Whole-genome sequencing (WGS) provides a powerful tool for analyzing bacterial genomes and investigating ARGs [7]. Illumina short-read sequencing remains the most widely used technology in microbial genomics due to its high accuracy and throughput [8]. However, reconstructing plasmids from short reads alone is challenging, as repetitive regions often exceed the read length and typical paired-end insert sizes (~300–500 bp), which prevents complete plasmid assembly and hinders accurate contextualization of ARGs [9,10]. In contrast, long-read sequencing platforms such as Oxford Nanopore Technologies (ONT) MinION generate reads of 8–10 kb or longer, which can span repetitive regions and improve plasmid reconstruction. Although long reads have higher error rates (5–15%), ongoing technological advances are steadily reducing this limitation. Hybrid assembly, which combines the structural resolving power of long reads with the base-level accuracy of short reads, has emerged as a promising approach for generating accurate and contiguous plasmid assemblies [10].
This study aimed to assess whether hybrid assembly enhances the characterization of ESBL/pAmpC-carrying plasmids in E. coli isolated from the broiler production pyramid compared to short- or long-read sequencing alone. In particular, we evaluated the ability of each approach to resolve plasmid structures, identify co-localized ARGs, characterize MGEs, and detect virulence genes.
2. Materials and Methods
2.1. Bacterial Isolates
The eight ESBL/pAmpC-producing E. coli strains included in this study (Table 1) were isolated from three production chains (A, B, and C) of an integrated broiler company in Northern Italy [11]. Screening for ESBL/pAmpC was performed on Eosin Methylene Blue agar (Microbiol, Cagliari, Italy) supplemented with 1 mg/L cefotaxime (CTX-EMB) and incubated at 37 ± 0.5 °C for 20 ± 2 h. ESBL/pAmpC production was confirmed by the double-disk synergy test according to CLSI guidelines [12]. Isolate characterization (i.e., phylogroups and sequence typing) and detection of ESBL/pAmpC resistance genes were performed by multiplex PCR [11]. A subset of isolates, selected on the basis of production chain, production stage, ESBL/pAmpC gene, and phylogroup combinations, had previously been sequenced using Illumina short-read technology [13].
2.2. Library Preparation and Whole-Genome Sequencing
In a previous study [13], bacterial DNA was extracted using the Invisorb Spin Tissue Mini Kit (Invitek, Berlin, Germany), libraries were prepared using the Nextera XT library preparation kit (Illumina, San Diego, CA, USA), and sequencing was performed on an Illumina HiSeqX platform with 2 × 150 bp paired-end reads (Macrogen, Seoul, Republic of Korea). In the present study, bacterial DNA was extracted with the QIAprep Spin Miniprep Kit (Qiagen, Hamburg, Germany) and long-read sequencing was performed on an ONT MinION platform using the Rapid sequencing gDNA barcoding kit (SQK-RBK110.96) (ONT, Oxford, UK) and R9.4.1 flow cells (ONT, Oxford, UK).
2.3. Raw Reads Quality Control
Illumina read quality was assessed using Falco (v1.2.4) [14]. Adapters and low-quality reads (Phred < 25) were removed using Trimmomatic (v0.39) [15] with default parameters. MinION reads were evaluated with NanoPlot (v1.44.1) [16] and filtered (Q > 8) with NanoFilt (v2.3.0) [17]. A graphical representation of the entire workflow is shown in Figure 1.
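The quality thresholds quoted above are Phred scores, which relate to the per-base error probability p as Q = −10·log10(p). The short Python sketch below is not the filtering code of Trimmomatic or NanoFilt; it only converts the two thresholds into error probabilities and shows one common way of averaging quality over a read (on the error-probability scale). The function names are illustrative assumptions.

```python
import math

def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)

def mean_read_quality(quals: list[int]) -> float:
    """Average quality of one read, computed on the error-probability scale
    (averaging raw Phred scores would overstate the quality)."""
    mean_p = sum(phred_to_error_prob(q) for q in quals) / len(quals)
    return -10 * math.log10(mean_p)

# Thresholds quoted in the text: Phred 25 (Illumina) and Q 8 (MinION).
print(round(phred_to_error_prob(25), 5))  # 0.00316
print(round(phred_to_error_prob(8), 3))   # 0.158
```

Under this relation, a Phred 25 cut-off corresponds to roughly 1 error in 316 bases, while Q 8 corresponds to roughly 1 error in 6 bases, which illustrates how much more permissive the long-read filter is.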
2.4. Bacterial and Plasmid Whole-Genome Assembly
Short-read assemblies were generated using Unicycler (v0.5.1) [18], whereas MinION long-read sequences were assembled using Flye (v2.9.6) [19]. Hybrid assemblies integrating both short- and long-read data were obtained using Unicycler (v0.5.1). General assembly statistics and quality metrics of the assembled genomes were calculated using QUAST (v5.3.0) [20].
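Among the contiguity metrics reported later, N50 is the contig length at which contigs of that length or longer account for at least half of the total assembly length. The following Python sketch illustrates the definition only; it is not QUAST's implementation, and the contig lengths are invented toy values rather than data from this study.

```python
def n50(contig_lengths: list[int]) -> int:
    """Return the N50: the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")

# Toy assemblies with the same total length (110 kb) but different contiguity.
fragmented = [40_000, 30_000, 20_000, 10_000, 5_000, 5_000]
contiguous = [100_000, 10_000]
print(n50(fragmented), n50(contiguous))  # 30000 100000
```

Two assemblies of equal total length can therefore differ widely in N50, which is why total length and N50 are reported as separate metrics.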
2.5. Identification, Annotation, and Characterization of ESBL/pAmpC-Carrying Plasmids
MOB-Recon (v3.1.9) [21] was used to reconstruct and type individual plasmid sequences from short-, long-read, and hybrid assemblies. Plasmids harboring ESBL/pAmpC genes were characterized using pMLST 2.0 (v0.1.0) with default parameters [22] and annotated with Prokka (v1.14.6) [23]. ISs and Tns were identified using MGE (v1.14.6) with default parameters [24].
2.6. Resistance and Virulence Genes Detection
ARGs located on ESBL/pAmpC-carrying plasmids and other plasmids were identified using ResFinder (v4.7.2) with 90% identity threshold and 60% coverage [25]. Virulence genes were detected using MGE (v1.14.6) [24].
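The identity and coverage thresholds act as a simple two-condition filter on candidate gene hits. The Python sketch below shows that logic under stated assumptions: the `Hit` record, the `filter_hits` function, and the example values are hypothetical and do not reproduce ResFinder's internal data structures or output format.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    gene: str
    identity: float  # percent identity to the reference gene
    coverage: float  # percent of the reference gene length that is aligned

def filter_hits(hits, min_identity=90.0, min_coverage=60.0):
    """Keep only hits that satisfy both thresholds."""
    return [h for h in hits
            if h.identity >= min_identity and h.coverage >= min_coverage]

# Invented example hits (not results from the study).
hits = [
    Hit("gene_A", identity=100.0, coverage=100.0),  # passes both thresholds
    Hit("gene_B", identity=99.2, coverage=55.0),    # fails coverage
    Hit("gene_C", identity=88.5, coverage=100.0),   # fails identity
]
print([h.gene for h in filter_hits(hits)])  # ['gene_A']
```

A hit must pass both conditions, so a near-identical but truncated gene (for example, one split across two contigs in a fragmented assembly) can be excluded on coverage alone.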
2.7. Statistical Analysis
Differences in assembly performance (e.g., number of contigs, total length, N50, number of genetic features extracted) across short-read, long-read, and hybrid assemblies were assessed using the non-parametric Kruskal–Wallis test followed by Dunn’s post hoc test in GraphPad Prism (v10.5.0). Statistical significance was set at p < 0.05.
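To make the test concrete, the sketch below computes the Kruskal–Wallis H statistic (with tie correction) in plain Python. It is an illustration, not the authors' Prism workflow: the contig counts are invented, Dunn's post hoc test is omitted, and the p-value helper relies on the chi-square approximation with 2 degrees of freedom, so it applies only to comparisons of exactly three groups and may differ from an exact small-sample p-value.

```python
import math

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with tie correction (ties get average ranks)."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    ranks, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        t = j - i                            # size of this tie block
        tie_term += t**3 - t
        i = j
    h = 12 / (n * (n + 1)) * sum(
        sum(ranks[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    return h / (1 - tie_term / (n**3 - n))

def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom: exp(-h / 2)."""
    return math.exp(-h / 2)

# Hypothetical contig counts for eight genomes per assembly type.
short_read = [120, 135, 150, 110, 142, 128, 160, 131]
long_read = [5, 7, 4, 9, 6, 8, 3, 10]
hybrid = [11, 14, 12, 16, 13, 18, 15, 17]

h = kruskal_wallis_h(short_read, long_read, hybrid)
print(round(h, 2), p_value_three_groups(h) < 0.05)
```

Because the test works on ranks rather than raw values, it makes no normality assumption, which suits a sample of only eight genomes per group.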
3. Results
3.1. Basic Statistics of Short and Long Reads
Statistics for both Illumina and MinION reads after trimming and quality filtering are reported in Table 2.
EC-33 and EC-91 showed higher long-read coverage than the other genomes, whereas short-read coverage was comparable among strains. Illumina sequencing yielded a significantly higher number of reads (p = 0.0002), total base pairs (bp) (p = 0.01), and coverage (p = 0.01) than long-read sequencing (Figure 2A–C).
3.2. Performances of Short, Long, and Hybrid Assemblies
An overview of the comparison of the genome assembly performances of short-, long-read, and hybrid assemblies is reported in Figure 3.
Illumina short reads produced more fragmented assemblies (higher number of contigs) than MinION long reads and hybrid assemblies (p < 0.05; Figure 3A). Total assembly length (bp) was comparable among the three approaches (Figure 3B), whereas N50 values were higher for long-read and hybrid assemblies than for short-read assemblies (Figure 3C).
3.3. Identification, Annotation and Characterization of ESBL/pAmpC-Carrying Plasmids in Short, Long, and Hybrid Assemblies
MOB-Recon detected a similar number of plasmids across assemblies (Figure 4A) and correctly identified the plasmid carrying the ESBL/pAmpC genes for each strain across the three assembly approaches.
Some differences among the assemblies were observed. Illumina assemblies produced a higher number of contigs compared to MinION (p = 0.033) and hybrid (p = 0.016) assemblies (Figure 4B), indicating that Illumina reads generated more fragmented plasmid sequences. Accordingly, N50 values were significantly higher for MinION (p = 0.044) and hybrid (p = 0.048) assemblies than for Illumina assemblies (Figure 4C). Lower contig numbers and higher N50 values corresponded to larger average plasmid sizes, although the differences were not statistically significant (p > 0.05; Figure 4D). While short- and long-read assemblies produced only one and two ESBL/pAmpC circular plasmids, respectively, four ESBL/pAmpC plasmids were fully circularized in hybrid assemblies (Table 3).
Seven different Inc replicons were identified across the eight plasmids (Table 3). MinION and hybrid assemblies detected all replicons, whereas Illumina reads failed to identify the IncFII replicon in the plasmid from strain EC-94. pMLST analysis was performed for replicons IncA, IncI1, and IncHI2. MinION assemblies performed the least effectively, failing to assign a plasmid sequence type (pST) for EC-94 and producing mismatches or gaps for the other plasmids. pMLST profiles matched between Illumina and hybrid assemblies, except for the plasmid in EC-40; Illumina classified it as pST-26 (clonal complex CC2), whereas hybrid assemblies identified it as a new pST assigned to CC-26. In both cases, pMLST alleles were perfectly matched to known references. Across all assemblies, 1296 coding DNA sequences (CDSs), including putative and hypothetical proteins, were identified among the eight plasmids. Annotated CDSs from Illumina and hybrid assemblies showed strong overlap (Figure 5), while MinION assemblies produced divergent results.
When putative and hypothetical proteins were excluded, short-read, long-read, and hybrid assemblies yielded a comparable number of CDSs (Figure 4E). Long-read assemblies identified more IS and Tn sequences than Illumina assemblies (mean = 6.13 vs. 1.5; p = 0.042; Figure 4F). Furthermore, long-read and hybrid assemblies outperformed short-read assemblies in characterizing the relationship between MGEs and ESBL/pAmpC genes. Indeed, while Illumina assemblies identified only three ISs (i.e., ISEc9 and IS26) and one Tn (Tn2) associated with ESBL/pAmpC genes, MinION and hybrid assemblies identified six ISs (i.e., ISEc9, IS26, and IS102), one Tn (Tn2), and one composite Tn (cn_5325_IS26) (Table 4).
A map of the ESBL-carrying plasmid harbored by strain EC-78, based on sequences and annotations obtained using the hybrid assembly approach, is shown in Figure 6.
3.4. Identification of Acquired Antimicrobial Resistance Genes and Virulence Factors in Short, Long and Hybrid Assemblies
Aside from genes conferring resistance to 3GCs, 25 different ARGs conferring resistance to eight antimicrobial classes (i.e., aminoglycosides, β-lactams, (fluoro)quinolones, macrolides, phenicols, sulphonamides, tetracyclines, and trimethoprim) were identified by the three assemblies across the eight ESBL/pAmpC-carrying plasmids. According to all three assemblies, five out of eight plasmids showed a multi-resistant profile, carrying genes conferring resistance to three or more antimicrobial classes. Details of the ARGs identified by each assembly for each plasmid are reported in Table 4. Six known virulence genes (i.e., cib, terC, traJ, traT, astA, and anr) distributed among four ESBL/pAmpC plasmids were identified by long-read and hybrid assemblies, while short-read assemblies detected only four.
3.5. Characterization of Other Plasmids in Short, Long and Hybrid Assemblies
Plasmids other than those carrying the ESBL/pAmpC genes were identified in all strains but one (i.e., EC-91). The average plasmid size was comparable among the three assemblies (Figure S1A). Hybrid assemblies yielded a higher number of circular plasmids than short-read (p = 0.184) and long-read (p = 0.129) assemblies, although these differences were not statistically significant (Figure S1B). Nine different plasmid types (i.e., Col(MG828), ColpVC, ColRNAI, IncFIA, IncFIB, IncFIC, IncHI1B, IncHI2A, and IncI-gamma/K1) were identified across the short-read, long-read, and hybrid assemblies, with the latter showing the best typing performance (65.38%, 95% confidence interval (CI) 45.79–84.98% vs. 44.12%, 95% CI 26.53–61.70% and 55.56%, 95% CI 35.52–75.59% for short-read and long-read assemblies, respectively), even though the differences among assemblies were not significant (Figure S1C). In total, 15 different ARGs belonging to seven antimicrobial classes (i.e., aminoglycosides, β-lactams, macrolides, phenicols, sulphonamides, tetracyclines, and trimethoprim) were identified among the non-ESBL/pAmpC plasmids across the three assembly methods. Plasmids harbored by strains EC-56 and EC-94 showed a multi-resistance profile. Although long-read assemblies identified fewer ARGs than the other two methods, no significant differences were observed (Figure S1D). Long-read and hybrid assemblies identified a higher number of ISs in non-ESBL/pAmpC plasmids than Illumina assemblies (Figure S1E); indeed, while short-read assemblies detected 0.9 ISs per plasmid, this ratio increased to 1.48 and 1.85 for long-read and hybrid assemblies, respectively. Similarly, long-read and hybrid assemblies identified more virulence genes (n = 51 and n = 47, respectively) than short-read assemblies (n = 40) (Figure S1F).
4. Discussion
The present study focused on the characterization of ESBL/pAmpC-carrying plasmids in E. coli strains isolated from the broiler production pyramid, using short-read, long-read, and hybrid assembly approaches. Our results, consistent with previous observations, demonstrate that the hybrid assembly strategy outperformed the exclusive use of either Illumina (short-read) or MinION (long-read) data in resolving plasmid structures and enabling detailed characterization [8,10,26]. Illumina assemblies produced a higher number of contigs and lower N50 values than MinION and hybrid assemblies, resulting in more fragmented and smaller plasmids. Consequently, short reads yielded fewer circularized plasmids, both ESBL/pAmpC-carrying and non-ESBL/pAmpC, than the other two approaches, particularly the hybrid assembly approach. These findings reflect a known limitation of Illumina sequencing: short reads cannot span the repeated elements commonly found in plasmids (e.g., ISs and Tns), making plasmid reconstruction more difficult and less accurate [27]. Long-read assemblies overcame this limitation but highlighted the lower sensitivity and higher error rate of nanopore sequencing. Specifically, long-read assemblies performed poorly in plasmid typing, producing mismatches and gaps against reference sequences and failing to type most plasmids, suggesting that nanopore data alone may not be optimal for plasmid typing or single-variant analysis [28]. However, in this study, long-read assemblies correctly classified two closely related genes of the blaCTX-M group (i.e., blaCTX-M-1 and blaCTX-M-15), which share a nucleotide sequence similarity of about 98.7% [29]. In hybrid assemblies, Illumina data corrected MinION errors while preserving the comprehensive detection of ISs and Tns. As a result, hybrid assemblies achieved typing accuracy comparable to Illumina while resolving plasmid structures more effectively and producing the highest number of circularized plasmids.
This enhanced resolution enabled the characterization of gene clusters carrying resistance determinants and transfer systems, highlighting potential recombination mechanisms mediated by insertion sequences and other transposable elements [30]. Long-read and hybrid assemblies enabled the identification of the association between three different ESBL genes (i.e., blaSHV-12, blaCTX-M-1, and blaCTX-M-2) and IS26, which would have been missed using Illumina sequencing alone. Because IS26 plays a key role in the mobility of ESBL genes between plasmids and from plasmids to the chromosome [31], the absence of this information can hinder our understanding of ESBL gene transmission across humans, animals, and the environment. Indeed, a recent study combining the genetic analysis of more than 2500 plasmid sequences with in vitro inter-plasmid antibiotic resistance gene transfer experiments reported that IS26 is likely to accelerate ARG dissemination among different bacterial species [32]. Furthermore, IS26-mediated transposition of blaKPC-2 seems to play a key role in the emergence of carbapenem resistance in Klebsiella pneumoniae [33]. Therefore, the identification and characterization of ISs, in particular IS26, are essential for implementing effective control measures and limiting the dissemination of critically important ARGs. Although dual sequencing increases costs, from an epidemiological and surveillance perspective, this added resolution justifies the use of hybrid assemblies.
The three approaches identified similar, if not identical, ARG profiles across the eight ESBL/pAmpC plasmids, indicating that hybrid assembly does not markedly improve resistance gene detection, in contrast to some previous observations [34].
Virulence factor (VF) gene prediction was comparable between long-read and hybrid assemblies, whereas Illumina assemblies identified fewer VF genes in both ESBL/pAmpC and non-ESBL/pAmpC plasmids. Previous studies have reported conflicting results when comparing short- and long-read sequencing for virulence gene detection [27,35]. Such discrepancies may be attributable to differences in library preparation, assembly, or bioinformatic tools and should be considered when comparing results across studies.
In conclusion, this study shows that hybrid assembly is a powerful approach in bacterial genomics. It provides a high-resolution view of plasmid architecture, improves the detection of the MGEs associated with resistance genes, and yields additional insights beyond those obtained from short- or long-read assemblies alone. Routine implementation of hybrid assemblies could generate essential knowledge to support targeted surveillance and intervention strategies against antimicrobial resistance.
5. Conclusions
This study highlights the advantages of hybrid assembly in resolving the structure and genetic context of ESBL/pAmpC-carrying plasmids from E. coli in the broiler production chain. By integrating short- and long-read sequencing, hybrid assemblies overcome the limitations of individual platforms, enabling more accurate plasmid reconstruction and the detection of key mobile genetic elements, such as IS26, involved in resistance gene dissemination. Although resistance gene profiles were broadly similar across approaches, the enhanced structural resolution provided by hybrid assemblies offers valuable insights into plasmid dynamics. Routine application of this strategy can strengthen antimicrobial resistance surveillance and guide more effective interventions within a One Health framework.
References
- 1. Li T., Wang Z., Guo J., de la Fuente-Nunez C., Wang J., Han B., Tao H., Liu J., Wang X. Bacterial Resistance to Antibacterial Agents: Mechanisms, Control Strategies, and Implications for Global Health. Sci. Total Environ. 2022, 860, 160461. doi:10.1016/j.scitotenv.2022.160461. PMID: 36435256. PMCID: PMC11537282.
- 2. Moghnieh R., Estaitieh N., Mugharbil A., Jisr T., Abdallah D.I., Ziade F., Sinno L., Ibrahim A. Third Generation Cephalosporin Resistant Enterobacteriaceae and Multidrug Resistant Gram-Negative Bacteria Causing Bacteremia in Febrile Neutropenia Adult Cancer Patients in Lebanon, Broad Spectrum Antibiotics Use as a Major Risk Factor, and Correlation with Poor Prognosis. Front. Cell. Infect. Microbiol. 2015, 5, 11. doi:10.3389/fcimb.2015.00011. PMID: 25729741. PMCID: PMC4325930.
- 3. Bonnet R. Growing Group of Extended-Spectrum β-Lactamases: The CTX-M Enzymes. Antimicrob. Agents Chemother. 2004, 48, 1–14. doi:10.1128/AAC.48.1.1-14.2004. PMID: 14693512. PMCID: PMC310187.
- 4. Carattoli A. Plasmids and the Spread of Resistance. Int. J. Med. Microbiol. 2013, 303, 298–304. doi:10.1016/j.ijmm.2013.02.001. PMID: 23499304.
- 5. Liebana E., Carattoli A., Coque T.M., Hasman H., Magiorakos A.-P., Mevius D., Peixe L., Poirel L., Schuepbach-Regula G., Torneke K. Public Health Risks of Enterobacterial Isolates Producing Extended-Spectrum β-Lactamases or AmpC β-Lactamases in Food and Food-Producing Animals: An EU Perspective of Epidemiology, Analytical Methods, Risk Factors, and Control Options. Clin. Infect. Dis. 2013, 56, 1030–1037. doi:10.1093/cid/cis1043. PMID: 23243183.
- 6. Trongjit S., Chuanchuen R. Whole Genome Sequencing and Characteristics of Escherichia coli with Co-Existence of ESBL and mcr Genes from Pigs. PLoS ONE 2021, 16, e0260011. doi:10.1371/journal.pone.0260011. PMID: 34784400. PMCID: PMC8594834.
- 7. Bonvegna M., Tomassone L., Christensen H., Olsen J.E. Whole Genome Sequencing (WGS) Analysis of Virulence and AMR Genes in Extended-Spectrum β-Lactamase (ESBL)-Producing Escherichia coli from Animal and Environmental Samples in Four Italian Swine Farms. Antibiotics 2022, 11, 1774. doi:10.3390/antibiotics11121774. PMID: 36551431. PMCID: PMC9774568.
- 8. De Maio N., Shaw L.P., Hubbard A., George S., Sanderson N.D., Swann J., Wick R., AbuOun M., Stubberfield E., Hoosdally S.J. Comparison of Long-Read Sequencing Technologies in the Hybrid Assembly of Complex Bacterial Genomes. Microb. Genom. 2019, 5, e000294. doi:10.1099/mgen.0.000294. PMID: 31483244. PMCID: PMC6807382.
