Prediction of Antibiotic Resistance Genes in Cyanobacterial Strains by Whole Genome Sequencing
Duarte Balata, Tânia Rosado, Francisco Pina-Martins, Vera Manageiro, Carina Menezes, Eugénia Ferreira, Octávio S. Paulo, Manuela Caniça, Elsa Dias

TL;DR
This study uses whole genome sequencing to identify antibiotic resistance genes in cyanobacteria from freshwater, revealing their potential role in spreading antibiotic resistance.
Contribution
The novel contribution is the detection of antibiotic resistance gene variants in cyanobacterial genomes and evidence of potential horizontal gene transfer.
Findings
Four antibiotic resistance gene variants conferring resistance to tetracyclines, fluoroquinolones, and macrolides were detected in cyanobacterial genomes.
The adeF-type resistance gene was most prevalent, found in 11 cyanobacterial genomes from the Nostocales order.
Planktothrix cyanobacteria showed the highest variety of macrolide resistance gene matches.
Abstract
Cyanobacteria are ubiquitous in freshwater environments, but their role in aquatic resistome remains unclear. In this work, we performed whole genome sequencing on 43 cyanobacterial strains isolated from Portuguese fresh/wastewaters. From 43 available non-axenic unicyanoabacterial cultures (containing only one cyanobacterial strain and their co-occurring bacteria), it was possible to recover 41 cyanobacterial genomes from the genomic assemblies using a genome binning software, 26 of which were classified as high-quality based on completeness, contamination, N50 and contig number thresholds. By using the comprehensive antibiotic resistance database (CARD) on the assembled samples, we detected four antibiotic resistance gene (ARG) variants, conferring resistance in pathogenic bacteria to tetracyclines, fluoroquinolones (adeF-type) and macrolides (ermF-type, mefC-type and mphG-type). Among…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6- —Portuguese Foundation for Science and Technology
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsPharmaceutical and Antibiotic Environmental Impacts · Antibiotic Resistance in Bacteria · Environmental DNA in Biodiversity Studies
1. Introduction
In recent decades, due to the overconsumption and misuse of antibiotics, as well as the lack of new available antibiotics, bacteria’s resistances are starting to catch up on the substances available to control their nefarious impacts [1]. Antibiotic resistance (AR) is one of the biggest threats to global health, and it is rising dangerously [2] and driving us to a post-antibiotic era, that is, to the resistance era [3,4].
AR should not be considered only from the clinical perspective, as there is a risk of not seeing the “whole picture”. It has been argued that the discussion about AR must include evolutionary and ecological perspectives given the role of environmental microbes as sources and dissemination vehicles of AR [4]. Water compartments, particularly, are important reservoirs of resistant microorganisms (both native and pathogenic bacteria) as well as pools of antibiotic resistance genes (ARGs) [5,6,7].
Even though it is recognized, the aquatic resistome is still under investigation, namely some bacterial groups that might play an important role as reservoirs of ARGs, such as cyanobacteria [8]. Indeed, in previous studies, we showed that non-axenic cyanobacterial strains from distinct genera exhibit a non-susceptible phenotype to antibiotics that are common in freshwater, such as trimethoprim and nalidixic acid [9,10]. Additionally, we identified a class-1-type integron, sul1, strA-strB and qacΔE genes in fresh- and wastewater cyanobacterial strains (non-axenic), supporting the hypothesis that cyanobacteria can acquire and transfer AR determinants [10]. In these studies, ARGs were detected through PCR amplification, using primers and PCR conditions designed for bacteria, considering that no specific PCR conditions and primers were established for cyanobacteria. Thus, it remained unclear whether these genes were from cyanobacteria and/or whether they belonged to their associated bacteria. Indeed, the isolation and purification of cyanobacteria in cultures is a difficult task, and most strains in culture collections are non-axenic, containing associated heterotrophic bacteria, similarly to what happens in the environment [11]. In fact, the importance of these bacteria for the long-term culturing of cyanobacteria is a matter of discussion.
Other recent studies have shown that cyanobacterial blooms may influence the composition of the antibiotic-resistant bacterial community in freshwater reservoirs, thus indirectly affecting the ARG profiles in these environments [12,13]. In addition, it was also demonstrated that cyanobacteria-associated eDNA carrying ARGs may be relevant for the acquisition and dissemination of ARGs in aquatic environments [14]. The same authors [15] also used qPCR to detect genes conferring resistance to major classes of antibiotics commonly administrated to humans and animals, namely sulfonamides (sul1, sul2), tetracycline (tetA, tetB) and quinolones (qnrB), in bacteria-free cyanobacterial strains from eutrophic lakes in China. The results from these studies strongly support the hypothesis that cyanobacterial blooms may play a role in the aquatic resistome. However, it still remains to be demonstrated whether the ARGs present in cyanobacterial blooms are indeed present in cyanobacterial genomes. Meanwhile, the distribution and diversity of ARGs in 862 publicly available cyanobacterial genomes (up to 2021) was reported in 2023 [16]. These authors found distinct ARGs in 10% of these genomes, providing another clue to clarify the role of aquatic and terrestrial cyanobacteria in the environmental resistome.
Cyanobacteria have been found to display highly plastic and variable genomes due to the process of horizontal gene transfer (HGT) [17]. This process is common in cyanobacteria and plays an important role in their evolution [18]. HGT allows bacteria to acquire exogenous genes from mobile genetic elements (MGEs) that are not present in their clonal lineage. It is therefore plausible to suppose that HGT can facilitate the transfer of antibiotic resistance coding sequences between cyanobacteria cells and pathogenic bacteria present in the freshwater bodies, where cyanobacteria are highly abundant. The persistent ARGs found in the environment are currently considered as emerging environmental pollutants and the increased morbidity associated with their dissemination has become a global concern for public health [19].
Despite this, cyanobacteria are a highly understudied group of microorganisms in what concerns their antibiotic resistance profile. This is probably because cyanobacteria are considered non-pathogenic to humans and difficult to isolate into pure cultures.
In the continuity of our above-mentioned studies [9,10], here, we describe a genome-sequencing approach to predict ARGs in 43 cyanobacteria strains that were previously evaluated for their antibiotic susceptibility phenotype and, for some of them, their antibiotic-resistant genotype. These strains belong to species of distinct orders, namely Chroococcales (e.g., Microcystis aeruginosa), Nostocales (e.g., Anabaena spp., Aphanizomenon spp.) and Oscillatoriales (e.g., Planktothrix spp.), which are highly abundant in Portuguese freshwater environments. These strains are deposited in the ESSACC [20] and LEGE-CC ([21,22]; https://lege.ciimar.up.pt/, accessed on 21 May 2025) culture collections and were previously isolated from surface freshwater and from a wastewater treatment plant, respectively.
This study’s goal is to improve the understanding of the importance of cyanobacteria as potential ARG reservoirs in freshwater environments through the whole genome sequencing (WGS) of non-axenic/unicyanobacterial cultures of freshwater cyanobacteria from different species. We developed a pipeline for this type of approach that is easily scalable and reproducible, as well as user-friendly, in order to be implemented and expanded in future analysis.
2. Materials and Methods
2.1. Cyanobacterial Strains
Forty-three cyanobacterial strains previously isolated from Portuguese surface freshwaters and from a wastewater treatment plant (WWTP) were studied (Supplementary Figure S1). Among these strains, 31 were isolated from 20 surface freshwater reservoirs, 4 from 2 rivers and 8 from a WWTP secondary decanter tank. The freshwater strains were isolated between 1996 and 2015, being maintained since then at the “Estela Sousa e Silva Algae Culture Collection” (Laboratory of Biology and Ecotoxicology, National Institute of Health_INSA, Lisbon, Portugal) [20]. The wastewater strains were isolated between 2006 and 2007 and belong to the “LEGE Culture Collection” (Laboratory of Ecotoxicology, Genomics and Evolution, Interdisciplinary Centre of Marine and Environmental Research_CIIMAR, Porto, Portugal) [21,22] (https://lege.ciimar.up.pt/, accessed on 21 May 2025).
Non-axenic/unicyanobacterial cultures of these strains were successfully maintained in laboratory culture conditions, namely in Z8 liquid medium [23] under a 14/10 h or 12 h/12 h light/dark cycle (L/D; light intensity 16 ± 4 μEm^−2^ s^−1^, approx.) at 20 ± 1 °C [10]. Cultures of Anabaena spp. (9 strains), Aphanizomenon spp. (9 strains), Microcystis spp. (9 strains) and Planktothrix spp. (16 strains) were prepared for further DNA extraction. The taxonomy of these strains was previously determined (at the isolation date) by microscopy according to the morphometric features and identification keys described in EN 15204:2006 [24] (Supplementary Table S1). As explained in the Section 3, some of the species were re-identified after analyzing the sequencing data obtained in the present work and confirming their morphological features. In addition, reclassification also took into account the revisions that were carried out by taxonomists for some species/genera.
2.2. DNA Extraction and Whole Genome Sequencing
Genomic DNA was extracted from 15 mL of cultures of the above-mentioned 43 cyanobacterial strains. Cells were harvested by centrifugation (SORVALL RC-5C Plus, Kendro Laboratory Products, Asheville, NC, USA) at 4000× g for 25 min, and DNA was extracted using a PowerWater^®^ DNA Isolation Kit (MoBio, Carlsbad, CA, USA) according to the manufacturer’s instructions. The DNA samples were analyzed by gel electrophoresis (0.8% agarose, w/v) and quantified using a Qubit^®^ 3.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA).
Sequencing libraries were prepared with a Nextera XT DNA Sample Preparation Kit (Nextera XT DNA Library Prep Kit Reference Guide 15031942v03, February 2018, Illumina) according to the manufacturer’s instructions. Briefly, the genomic DNA extracted from each cyanobacteria culture (0.2 ng/µL) was tagmented at 55 °C for 5 min. Illumina index sequences (Nextera XT Index Kit, FC-131-1001, Illumina, San Diego, CA, USA) were added to each tagmented DNA sample by PCR.
The library DNA fragments were size-selected and purified using AMPure XP beads (Beckman Coulter, Inc.), and the quality control, including the size and concentration of each library, was estimated on a Fragment Analyzer (HS NGS Fragment Kit 1-6000 bp, DNF-474, Agilent, Santa Clara, CA, USA).
Libraries were normalized and pooled to 4 nM. Pooled libraries were denatured and diluted (Denature and Dilute Libraries Guide, Document # 15,039,740 v10, February 2019, Illumina) to a final concentration of 15 pM/20 pM (V2/V3 Illumina sequencing kit (MS-103-1003/MS-102-3003), respectively) with 1% of Phix (FC-110-3001, Illumina) as sequencing control. Whole genome sequencing was carried out using a 2 × 250 paired-end (PE) reads configuration on a high-throughput Illumina MiSeq platform in the Technology and Innovation Unit—Department of Human Genetics (UTI-DGH) facilities from INSA. Image analysis and base calling were conducted by MiSeq Control Software (MCS) directly on the MiSeq instrument (Illumina, San Diego, CA, USA).
The use of non-axenic/unicyanobacterial cultures means that each cyanobacteria culture contains not only a single cyanobacterial strain but also bacteria that were present in the water samples from which the cyanobacteria were isolated and that were co-cultured with the cyanobacterial strain. In this sense, the total DNA obtained from each cyanobacteria culture corresponds, indeed, to a metagenome. In this work, we aimed to recover cyanobacterial genomes from the correspondent metagenomes, and therefore, these are henceforth called “genomes” and not “metagenomes”.
2.3. Data Analysis
2.3.1. Sequence Filtering
The raw Illumina MiSeq output files in fastq format were analyzed using FastQC version 0.11.8. This software provided insights on read quality scores, per-base sequence content and per-sequence GC content. Anomalous base content regions in the beginning and trailing portion of the reads, resulting from the presence of Illumina sequencing adapters, were trimmed utilizing Cutadapt version 1.18.
Reads were also filtered by quality score using BBTools suite’s BBDuk version 38.32. Every read with an average Phred quality score below 33 was discarded from the analysis. BBDuk was also used to create interleaved fastq files from the original Illumina paired read format file, which were needed in downstream software.
2.3.2. Taxonomy Classification
Sequences were taxonomically classified using Kraken2 [25] version 2.0.7 and the SILVA 16S database [26], which, at the time of this work, presented information on a higher variety of cyanobacteria genera than the Refseq genomic database.
2.3.3. Genome Assembly and Genome Binning
The reads were assembled with SPAdes [27] version 3.13.1, using careful mode with kmer lengths of 21, 33, 55, 77, 99 and 127.
The assembled genome was separated into bins corresponding to the multiple species present in the communities, using MaxBin2 [28] version 2.2.6. This software was chosen because it can bin contigs as short as 500 bp, being therefore suited for lower coverage/more fragmented genomes. Using this genomic binning software, each genome was binned into 2 to 12 genomic bins, each corresponding to a taxonomic group, based on their sequencing depth and GC base content. Since this work’s main goal was to recover cyanobacteria genomes, the bins containing cyanobacteria sequences were identified amongst the remaining bacteria by comparing their genome characteristics (length, GC%) to reference genomes of the same genera, as well as querying the obtained sequences against reference databases using BLAST [29] version 2.11.0.
An in-house Python script named pick_bin.py (commit ‘b36d096’) was made to access the summary files produced by MaxBin2 and automatically select the bin that is most similar to the provided cyanobacterial reference. This selection was performed based on a comparison of the weighted arithmetic means of GC percentage and genome size between the bin and reference genome. A weight of 90% was attributed to the GC content parameter since it works as an identifier for the genera and is less subject to being compromised by the lack of data for some genomic regions. The remaining 10% was attributed to the length of the genome since it is considerably more variable than GC% and highly subject to bias when the whole genome is present for sequencing reasons. Despite this, this parameter is important, mostly to discard very small bins (likely to be binning artifacts) or when dealing with species with close GC contents.
The bins picked by pick_bin.py were analyzed with QUAST [30] version 5.0.2 in order to create a genome assembly quality report. This software presented many insightful assembly statistics, such as genome length, N50, GC content and number of contigs per assembly. N50 measures assembly quality in terms of contiguity and is defined as the sequence length of the shortest contig at 50% of the total genome length.
The multiple generated bins were then classified using Kraken2 again to attest for pick_bin.py’s bin selection and to obtain a sense of the taxonomy of the non-cyanobacterial bins. In all instances of Kraken2′s use, the Pavian online tool version 1.0 was used to obtain an interactive table visualization of the attributed taxonomy labels and well as Sankey plots.
Some of the reads were then mapped to their corresponding SPAdes assembled genome using BWA [31] version 0.7.12, and the resulting alignment SAM files were sorted and indexed using Samtools [32] version 1.9. These SAM alignments were visualized using Tablet [33] version 1.19.05.28 in order to explore the contig regions of previously PCR-amplified genes from our strains and confirm the effectiveness of the utilized SPAdes assembling parameters by checking the contiguity of previously known sequences obtained from the PCR sequencing of our samples.
CheckM [34] version 1.1.3 was used to analyze the binned genomes and calculate their estimated levels of completeness and contamination.
Raw sequence reads used for the assembly were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive under accession numbers SAMN34233939 to SAMN34233979 (Bioproject ref. PRJNA956929).
2.3.4. Antibiotic Resistance Gene Prediction
The prediction of ARGs was performed using the Resistance Gene Identifier tool, available at the CARD website [35,36], using CARD version 3.0.3 with the “Perfect, Strict and Loose hits” criteria, and it nudged >95% Loose hits to Strict. The option “High quality/coverage” was chosen for samples presenting a genomic coverage greater than 10×, while “Low quality/coverage” was used for the remaining samples.
Antibiotic resistance hits were qualified using the following CARD criteria: Perfect hits mean that a region from the query genome is exactly the same as the reference ARG, having 100% of its length and 100% alignment identity. Strict hits are sequences almost the same as that of the reference gene in either length or identity, but only partially similar in the other parameter. Proteins encoded by ‘Strict hit’ genes present a high probably of maintaining the AR trait displayed by the reference gene according to the CARD prediction algorithm. Loose hits correspond to genes that present some percentage of similarity with the ARGs from the CARD, but might not necessarily display the same function. Loose hits can help with the detection of emergent threats or distant homologs of known ARGs [36].
2.4. New Bioinformatics Pipeline
In order to automate the analysis and make it less time consuming and easily reproducible, we developed a new pipeline using GNU Make, Shell scripting and Python3. This CyanoPipeline (Figure 1; commit ‘f1cd1c4’) receives three inputs: the Illumina-produced fastq files, a reference genome of the species of interest in fasta format and a taxonomy database for Kraken2, consisting of four files (database.kdb, database.idx, taxonomy/nodes.dmp, taxonomy/names.dmp), obtained using the ‘kraken-build’ command. From these three inputs, it automatically performs all the previously described analyses (Section 2.3), giving the user the possibility to easily change some variable analysis parameters from the Makefile, namely, the read trimming length, the read quality score threshold and the number of CPU threads to use. All other parameters can also be altered easily by accessing each software’s corresponding shell scripts.
3. Results
3.1. Genome Assembly
The obtained assembly lengths for the 43 genomes ranged from 5,461,759 to 54,668,933 bp, presenting significant heterogeneity, probably due to the use of different types of flow cells during strain sequencing. As for the number of contiguous sequences (contigs) of lengths over 500 bp, generated per assembly, their quantities ranged from 1653 to 19,547. It is noticeable that samples containing Microcystis strains generally presented the lowest number of contigs (3565) when compared to other sample sets, presenting a mean number per sample of approximately 42% of the average global value (8460). On the other hand, Planktothrix mougeotii samples displayed a relatively high number of contigs when compared to others, with roughly 182% of the global average (numbers averaging 15,472 per sample).
After processing the raw reads, Kraken2 was used, along with the SILVA 16S database, to attribute taxonomic labels, at the genus level, to the reads present in each of the genomic assemblies. Using Pavian to filter the results to display only cyanobacterial matches, we obtained the results shown in Supplementary Table S2. A cyanobacterial sample was considered likely to belong to the genus to which it presents the highest matching percentage attributed by Kraken2. The matching percentage refers to the percentage of query reads that present the highest identity score with a given reference data set.
Complete metagenomic classifications of all samples are represented using Sankey diagrams in Supplementary Figure S2. Apart from the cyanobacteria, the remaining bacteria on the samples belonged primarily to the phyla Firmicutes, Proteobacteria and Bacteroidetes.
When restricting the outputs to the phylum Cyanobacteria, the Anabaena samples LMECYA 123C and LMECYA 161 showed the greatest correspondence with the genus Dolichospermum (78.58% and 85.36% of the reads, respectively). LMECYA 165, LMECYA 182 and LMECYA 204 were identified as Sphaerospermopsis, with 71.24%, 55.02% and 75.4% of their reads matching this genus, respectively. LMECYA 213 and LMECYA 313 matched with Aphanizomenon (70.14% and 23.75%, respectively), and LMECYA 246 with Anabaena, with 65.5% of the reads (Supplementary Table S2.1).
Aphanizomenon samples LMECYA 009, LMECYA 040, LMECYA 191, LMECYA 237 and LMECYA 328 were identified as belonging to the genus Aphanizomenon, with 37.36%, 51.12%, 48.68%, 56.65% and 73.17% of their reads matching the reference sequence, respectively. LMECYA 031 and LMECYA 190 showed the greatest correspondence with Cuspidothrix (48.54% and 52.04%, respectively). LMECYA 089 and LMECYA 253 were attributed to the genus Dolichospermum, with 46.12% and 45.29% matching (Supplementary Table S2.2).
The Kraken2 identification of all Microcystis samples was in conformity with the previous morphological and morphometric identification, attributing all samples to the genus Microcystis. These samples had between 46.84% and 69.47% of their classified reads attributed to this genus (Supplementary Table S2.3).
Every P. agardhii sample presented the highest matching percentage with the genus Planktothrix, as initially expected, except for LMECYA 303. The percentages of bases attributed to this genus were 85.92% to 90.55% (Supplementary Table S2.4).
Similarly, every P. mougeotti sample displayed the highest percentage of matching reads with the genus Planktothrix, ranging from 73.12% to 86.08% (Supplementary Table S2.5).
Considering the differences between the original taxonomic classifications of the cyanobacterial strains of Anabaena and Aphanizomenon based on morphology and the genomic data, we reclassified these strains, as presented in Supplementary Table S1. Besides the morphologic/morphometric and genomic data, we also took into account the taxonomic revisions that were carried out for the Nostocales order.
Indeed, the taxonomic identification of cyanobacteria has been historically performed through optical microscopy, according to the morphometric and morphologic features and identification keys described in EN 15204:2006. This is still a widely used method in phytoplankton and cyanobacterial bloom monitoring in freshwater reservoirs [37] and was the method used to identify the strains in this work, as referred in Section 2.1. However, ambiguity regarding several species/genera identifications and incongruencies revealed by phylogenetic information has led to taxonomic revisions in the last two decades, and polyphasic approaches, combining morphologic, molecular and ecologic criteria, have been proposed as the more adequate cyanobacteria identification procedure [38]. The order Nostocales, in particular the genera Anabaena and Aphanizomenon, has suffered many taxonomic revisions. For example, the benthic Anabaena morphotypes were maintained within the Anabaena genus, while its planktonic morphotypes were reclassified as Dolichospermum [39]. However, while some authors adopted the new nomenclature, others still maintained the old one for several years [40]. Another example is the Aphanizomenon issatschenkoi species, which was reclassified into Cuspidothrix issatchenkoi [41], but maintained this double nomenclature for several years [42].
In the present study, we found that some Nostocales LMECYA strains had not yet been reclassified, so the results of the genome analysis did not coincide with the initial cyanobacterial classification. This by itself is an outcome of the work, enabling the correct identification of LMECYA strains through the analysis of genomes and further confirmed by the observation of their morphological features.
3.2. Recovered Cyanobacterial Genomes
Out of the 43 sequenced genomes, it was possible to recover cyanobacteria assemblies from 41 of them. The length and GC content parameters of the recovered cyanobacteria genomes are represented on Figure 2 and are close to the previously reported values for Aphanizomenon spp. (5.187 Mb/37.65%), Anabaena spp. (5.306 Mb/38.38%), Planktothrix agardhii (5.512 Mb/39.50%) and Planktothrix mougeotii (5.499 Mb/39.5%) [43,44,45,46]. For Microcystis spp. samples, the GC base % was close to the available Microcystis aeruginosa reference genome, but the lengths on our assemblies (average 3.648 Mb [±0.917 Mb]) were smaller (5.2 Mb) [47]. This can be related to the lower output of the Illumina Nano flow cell used in the sequencing of our Microcystis spp. strains, as it was not capable of producing reads across the entire length of the cyanobacterial genome.
Genome quality was assessed using the CheckM software, which estimates completeness and contamination based on lineage-specific marker genes (Figure 3, Supplementary Table S3). Of the 41 cyanobacterial genomes recovered, 34 were classified as ‘Near complete’ (completeness ≥90%; contamination ≤5%); 4 were ‘Medium quality’ (completeness ≥70%; contamination ≤5%); and 3 were ‘Partial’ (completeness ≥50%; contamination ≤5%).
In terms of genome fragmentation, all genomes presented fewer than 2000 contigs, and more than 50% of the recovered genomes (21/41) were composed of fewer than 500 contigs. As for N50, 26 out of the 41 recovered genomes displayed values >10 kb. The values on this parameter were heterogeneous, varying between 2 and 200 kb, which can be partly explained by the different sequencing output obtained during the sequencing step for different strains (Figure 4).
As for genome coverage, 29 out of the 41 recovered cyanobacteria genomes presented average sequencing depths above 10× (Supplementary Table S4).
Additionally, by applying the quality criteria defined by Parks et al. [48] (completeness—5× contamination > 90; N50 > 10 kb; #contigs < 1 k), we identified 26 cyanobacterial genomes as high quality. Out of the 15 lower-quality recovered genomes, 11 of them (all Microcystis, 1 P. agardhii and 1 P. mougeotii) could be associated with insufficient sequencing depth, which did not allow us to proceed with the ARG search.
Overall, the results confirm that most of the recovered genomes are suitable for downstream analyses, including ARG detection and taxonomic reclassification.
3.3. In Silico Analysis of Antibiotic Resistance Genes
All genomic bins were tested for the presence of known ARGs or close variants, including those obtained from incomplete genomes or belonging to out-target species, since they could also present valuable information about the antibiotic resistance determinants present in the bodies of water and the possible co-occurrence of similar genes between species in a community.
The in silico analysis of the assembled genomes for the presence of known ARGs showed the presence of 15 different ARGs detected using the ‘strict hit’ criteria: genes associated with antibiotic inactivation (aadA2-type, aadA6/aadA10-type, aadA8-type, aadA13-type, aadA16-type, aph(3")-Ib-type, aph(6)-Id-type, ampS-type, mphG-type), with antibiotic efflux (adeF-type; floR-type, mefC-type, qacH-type, tetD-type) and with antibiotic target alteration (ermF-type). Additionally, one gene associated with target alteration was detected as a ‘perfect hit’—sul1 (Table 1).
From the 16 genes detected on the assembled freshwater cyanobacteria and bacteria, only 4 were present in the recovered cyanobacteria genomes—adeF-type, mefC-type, mphG-type and ermF-type (Table 1, Figure 5). All ARG hits found on the cyanobacteria genomes appeared on assemblies previously classified as ‘High quality’, according to the criteria defined by Parks et al. [48].
The adeF-type gene was the most widely distributed ARG in cyanobacteria, found across species from the Nostocales order (using the strains’ reclassification, Supplementary Table S1): Anabaena torulosa (LMECYA 313), Chrysosporum berghii (LMECYA 246), Sphaerospermopsis sp. (LMECYA 182), Aphanizomenon spp. (LMECYA 009, LMECYA 040, LMECYA 089, LMECYA 237, LMECYA 253, LMECYA 328) and Cuspidothrix issatschenkoi (LMECYA 031, LMECYA 190) on a total of 11 recovered genomes (Table 1). This gene was the only ARG detected in the Nostocales genomes, as well as the only ARG identified in co-occurring bacteria from the Aphanizomenon spp. and Cuspidothrix issatschenkoi cultures (Table 1, Figure 5).
The remaining ARGs identified in the cyanobacteria (ermF-type, mphG-type and mefC-type) were detected exclusively in the genus Planktothrix (Table 1). The ermF-type gene was detected on the P. agardhii (LMECYA 280) and P. mougeotii (LEGE 06224, LEGE 06226 and LEGE 07231) recovered genomes, while mphG-type and mefC-type were found exclusively on P. mougeotii samples (LEGE 06224 and LEGE 06226).
The distribution of these cyanobacterial genes according to the origin of the cyanobacterial strains shows that the adeF-type gene has a wide occurrence through freshwater, both from urban (mostly located on the coast) and rural areas (mostly those in the interior of the country) (Figure 6). The ermF-type gene was detected both in fresh- and wastewater, while the mefC-type and mphG-type genes were detected exclusively in cyanobacteria isolated from the WWTP (sampling point 21) (Figure 6).
Beyond the ARGs identified in cyanobacteria, we also detected several resistance genes exclusively in the associated bacterial genome bins, highlighting the broader resistome present in these non-axenic cultures. For example, sul1, a well-known sulfonamide resistance gene, was found as a perfect hit in the bacterial bins associated with the Dolichospermum (LMECYA 161) and Planktothrix (LMECYA 230, LEGE 06224, LEGE 07227) strains, showing a conserved sequence length and identity. Aminoglycoside resistance genes such as aadA2, aadA13 and aadA16 (involved in antibiotic inactivation via adenylyltransferases), as well as aph(3’’)-Ib and aph(6)-Id, were detected in bacteria co-occurring with the Dolichospermum (LMECYA 161), Planktothrix agardhii (LMECYA 230, LMECYA 280, LMECYA 283, LMECYA 292) and Planktothrix mougeotii (LEGE 06224, LEGE 06225, LEGE 06226, LEGE 06233, LEGE 07227, LEGE 07231) strains. Notably, aph(3″)-Ib, aph(6)-Id and floR (a chloramphenicol/florfenicol efflux pump) were frequently co-located in bacterial genomes from Planktothrix mougeotii cultures, suggesting the presence of a multidrug resistance region.
4. Discussion
The main goal of this work was to predict the presence of ARGs in cyanobacteria isolated from freshwater and WWTP water samples to understand their putative role on the aquatic resistome using a new bioinformatic pipeline. For this purpose, non-axenic cultures of cyanobacterial strains from these environments (and whose antibiotic susceptibility phenotype we previously determined [9,10]) were analyzed by WGS to assess the presence of ARGs in the cyanobacterial genomes. The genomic data generated also allowed the discrimination of the bacteria in co-culture with the cyanobacteria under study, and consequently distinguished ARGs from both microorganisms. The genome recovery approach yielded high-quality assemblies in most cases, with completeness and contamination metrics that met the accepted standards for metagenome-assembled genomes [48], supporting the robustness of our binning strategy.
The most widely detected ARG across all samples was adeF-type, which was found in 25 genomic assemblies. This gene was present in 11 cyanobacterial bins, 1 belonging to Anabaena torulosa, Chrysosporum bergii and Sphaerospermopsis sp., 6 to Aphanizomenon spp. and 2 to Cuspidothrix issastschenkoi. It was also the only ARG detected on the genomes of cyanobacteria from the Nostocales order, as well as the only ARG on co-occurring bacteria from the Aphanizomenon and Cuspidothrix cultures. The adeF gene codifies to a membrane fusion protein of the multidrug efflux complex AdeFGH. Bacteria exhibiting this gene have shown enhanced resistance to fluoroquinolone and tetracycline antibiotics, as well as chloramphenicol, clindamycin, trimethoprim, sulfamethoxazole and sodium dodecyl sulfate, and dyes such as ethidium bromide, safranin O and acridine orange [49]. The identification of the adeF-type gene in the studied cyanobacterial genomes could thus explain the reduced susceptibility phenotype of freshwater Chrysosporum bergii (LMECYA 246) and Aphanizomenon gracile (LMECYA 40) to trimethoprim and nalidixic acid, as we previously described [9]. In a previous study, we also reported the reduced susceptibility of Planktothrix agardhii and Planktothrix mougeotti strains to trimethoprim, norfloxacin and nalidixic acid [10]. However, in the present study, we did not detect the adeF gene in these species, which suggests another mechanism underlying this resistant phenotype, including intrinsic resistance, as we previously hypothesized [10]. It is interesting to note that in 9 out of the 11 adeF-type genes detected on the cyanobacterial binned genomes, the gene was also present in bins belonging to bacteria on the same sample, suggesting HGT of this ARG under selective pressure. Although no direct genetic evidence of HGT, such as ARGs within mobile genetic elements or flanking insertion sequences, was identified in this study, this co-occurrence provided indirect support for HGT between the cyanobacteria and co-existing bacteria. Future studies using long-read sequencing and functional validation assays are required to confirm the mobility and expression of these ARGs. Nevertheless, the potential for ARG mobility was already predicted in cyanobacterial genomes by Timms et al. [16], namely for genes associated with resistance to aminoglycosides and beta-lactams. Previously, Wang and Hong [50] reported the ubiquity of adeF and other efflux pump-associated ARGs in samples from reclaimed water distribution systems, namely from POU (Point of Use). These authors hypothesized that the enrichment of these environments with efflux pump mechanisms may be due to chemical stressors (like chlorine) and/or to antibiotic residues that are commonly present in the water samples [50]. Indeed, quinolones and tetracyclines are relevant prescribed antibiotics in human and veterinary medicine in Portugal, respectively [51], so it is not surprising that they are important freshwater contaminants underlying resistance mechanisms in the environment.
Another potentially interesting gene detected in four Planktothrix samples (P. agardhii LMECYA 280; P. mougeotii LEGE 06224, LEGE 06226 and LEGE 07231) was ermF. The ermF gene encodes for a protein that confers cross-resistance to macrolides, lincosamides and streptogramin B (MLSb) [52], meaning that it is part of the RNA methyltransferase family of the 23S ribosomal RNA conferring degrees of the MLSb phenotype [53]. Considering the clinical relevance, prevalence in the environment, association with MGE and HGT potential of the ermF gene [54], we hypothesize that cyanobacteria might acquire this gene from their bacterial neighbors in aquatic habitats, contributing to their putative role in the aquatic resistome. Indeed, ermF was suggested as a genetic indicator for assessing resistance to macrolides in the environment [54]. Actually, Timms et al. [16] also found an erm gene (ermC) associated with a plasmid replicon in the genome of the cyanobacteria C. issatschenkoi CHARLIE-1.
In the present study, we also found that two out of four occurrences of the ermF-type gene in our Planktothrix samples appeared associated with mefC-type and mphG-type genes. This association was also present in a bacterium from our P. mougeotii LEGE 06225 culture. The mefC gene encodes for a macrolide efflux pump capable of facilitating macrolide antibiotics out of the cell, while mphG encodes for a macrolide 2′-phosphotransferase responsible for the inactivation of macrolide antibiotics. These genes have previously been described as transferring together [55] as part of a multidrug plasmid from the species Photobacterium damselae, a Gram-negative bacteria that is pathogenic for marine organisms. Nonaka et al. [55] tested how the presence of these genes together on E. coli cells would affect the minimum inhibitory concentration (MIC) of antibiotics on the samples. They found that E. coli started displaying “high-level macrolide resistance via combined expression of the efflux pump and macrolide phosphotransferase”. The observations made by these authors represent the first registered occurrence of this novel gene pair on marine bacteria [55]. To our knowledge, our results also represent the first report of this gene pair on freshwater cyanobacterial cells. The resistance phenotype of the studied cyanobacteria to macrolides should be further investigated considering that we did not include antibiotics from this class in our previous studies. Our results also hint at a possible new association with ermF, reinforcing the resistance to macrolides even further. This association should be deeply studied on Planktothrix using targeted sequencing methods in order for it to be better understood.
In our study, it is also interesting to note that macrolide resistance genes were mainly detected in cyanobacteria/bacteria isolated from the WWTP. Indeed, a report of antibiotic residues in the WWTP final effluents of seven European countries showed that the macrolides azithromycin and clarithromycin were ubiquitous, presenting their maximal concentrations in Portuguese samples (1577.3 ng/L and 346.8 ng/L, respectively) [56]. However, in parallel to the WWTP, we cannot exclude the hypothesis that freshwater cyanobacteria (and their associated bacteria) may develop macrolide resistance mechanisms, considering that they could be exposed to macrolide residues originating from veterinary use that could end up in surface freshwater reservoirs located in rural areas [57]. In fact, erythromycin, clarithromycin and azithromycin were included in the 2015 and 2018 EU Watch List of potentially hazardous compounds for the aquatic environment, meaning that their occurrence in the environment creates a potential risk for humans and other living organisms, but the knowledge about this risk is not sufficient. They were removed from the 2020 revised version of this list, but they are still categorized as Highest Priority/Critically Important Antimicrobials by the WHO [57]. In addition, the biotic and abiotic degradation of macrolides in the environment can lead to the formation of by-products with under-investigated ecotoxicological impacts [58].
In summary, our research on ARGs in cyanobacterial genomes revealed the presence of four variants conferring resistance in pathogenic bacteria to tetracyclines, fluoroquinolones (adeF-type) and macrolides (ermF-type, mefC-type and mphG-type). The gene adeF-type was found across genomes from the Nostocales order. The genes ermF-type, mefC-type and mphG-type were found across Oscillatoriales genomes. Interestingly, Timms et al. [16] recently predicted ARGs conferring resistance to several antibiotics (β-lactams, chloramphenicols, tetracyclines, macrolides and aminoglycosides) in 10% of a total of 862 analyzed cyanobacterial genomes, with Nostocales (23%) and Oscillatoriales (8%) genomes being those with the highest percentage of ARGs.
Our results also revealed that all the other detected ARGs were located in genomic bins belonging to bacteria present in the cyanobacterial cultures under study. The sul1 gene was the most conserved ARG found in our samples, being detected in four different samples with similar sequence lengths and perfect identity scores related to those of the CARD reference variant. This gene confers resistance to sulphonamides via antibiotic target replacement and has been described as abundant worldwide and present in a large variety of species [59,60]. Another interesting result found in six bacteria from the cyanobacterial cultures was the aph(3″)-Ib-type, aph(6)-Id-type and floR-type association. APH(3″)-Ib and APH(6)-Id promote the inactivation of aminoglycoside antibiotics, and their simultaneous presence has been strongly associated with streptomycin resistant phenotypes in Corynebacterium striatum [61], an emerging multidrug-resistant skin and mucous membrane pathogen [62]. The floR gene, on the other hand, is a well-described efflux pump gene conferring resistance to phenicol [63]. The reason for this association between the three genes is uncertain. One of the plausible explanations is that it presents a fitness advantage in a wastewater environment with the presence of multiple antibiotics, and were then transferred through a multidrug resistance plasmid. All these ARGs, while not found in cyanobacterial genomes, reflect the potential for gene exchange in shared environments and reinforce the ecological relevance of studying cyanobacteria within their microbial associations.
Together with other studies [12,13,14,15,16], our results put in evidence the role of cyanobacteria and their associated bacterial communities as potential drivers of antibiotic resistance in water environments. Furthermore, the other partial ARG hits detected on the cyanobacterial bins can also be used as a guide in choosing antibiotics to be screened in phenotypic approaches with cyanobacteria strains. These partial hits can also eventually be used in new studies for the discovery of putative new ARGs.
This newly developed CyanoPipeline integrates quality filtering, assembly, taxonomic classification, genome binning and ARG prediction into a unified and fully automated workflow. This comprehensive approach enhances reproducibility and significantly reduces manual intervention compared to conventional multi-tool pipelines. Unlike general-purpose metagenomic workflows such as Anvi’o [64], MetaWRAP [65] and ATLAS [66], which require extensive user input and lack optimizations for cyanobacterial genomes, CyanoPipeline incorporates a custom bin selection algorithm based on weighted GC content and genome size, improving its accuracy in identifying target cyanobacterial bins and minimizing user bias. Utilizing MaxBin2 to bin contigs as short as 500 bp enables the effective recovery of fragmented genomes from complex, non-axenic cultures. The pipeline successfully recovered 41 cyanobacterial genomes, 26 of which met high-quality standards (completeness ≥ 90%, contamination ≤ 5%), achieving recovery rates that match or exceed those reported in comparable metagenomic studies [16,65,67,68]. Moreover, by integrating ARG prediction alongside genome reconstruction, the pipeline offers a scalable and reproducible solution tailored for high-throughput environmental resistome analyses in cyanobacterial metagenomes.
A final note to mention is that as a parallel output, this work is also useful for the reclassification of some cyanobacteria strains from the ESSACC culture collection, as referred to in the Section 3.1. The misidentification of cyanobacteria is a common situation, being particularly problematic in comparative studies. In the case of AMR, it may lead to the biased interpretation of the ARG distribution among different genus/species, which may compromise the identification of species- or genus-specific resistance traits, such as intrinsic antibiotic resistance or the potential for ARG transfer.
5. Conclusions
In this study, we use a new genome binning approach to identify ARGs in environmental (freshwater and wastewater) cyanobacteria. The whole genome sequencing of non-axenic cultures of cyanobacteria represents a straightforward, accurate and effective strategy to explore genomic features in hard-to-isolate organisms, given the availability of good sequencing depth assemblies. This procedure enabled us to overcome the complexity and specificity of non-axenic cyanobacteria cultures and allowed us to obtain 26 high-quality cyanobacterial genomes out of 41 sequenced cultures. Our results also indicate that cyanobacteria might play a putative role in the emergence and/or dissemination of ARGs since, though presenting a low diversity of ARG types, several variants of known ARGs were detected across the samples from distinct genera of cyanobacteria: adeF-type, ermF-type, mphG-type and mefC-type; these genes had never been previously described on freshwater cyanobacterial genomes, which is a significant contribution to the present field of the aquatic (cyano)resistome.
Overall, these findings highlight the importance of studying the role of cyanobacteria in the environmental resistome from a One Health perspective. Indeed, increasing evidence points out the participation of cyanobacteria and their associated phycosphere in the cycle of AR dissemination in aquatic environments. Cyanobacteria blooms may function both as receptors of ARGs (e.g., from livestock pathogenic bacteria in grazing areas or from wastewater discharges) and as vehicles for ARG dissemination to animals (livestock, as well as wild and companion animals) and humans through exposure in surface freshwater reservoirs. On the other hand, a possible role of cyanobacterial toxins in exacerbating HGT in ARGs in microbial biofilms in drinking water treatment plants and the potential stimulation of cyanotoxin production by antibiotics have also been considered as additional issues when considering the complexity of the relation between cyanobacteria and antibiotic pollution (antibiotic residues, resistant bacteria, eDNA) [69]. In this context, cyanobacterial blooms should be considered in the scope of AMR monitoring in water bodies, namely in surface freshwaters where antibiotic pollution is a concern.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Banin E. Hughes D. Kuipers O.P. Bacterial pathogens, antibiotics and antibiotic resistance FEMS Microbiol. Rev.20174145045210.1093/femsre/fux 01628486583 · doi ↗ · pubmed ↗
- 2WHO Antibiotic Resistance 2021 Available online: https://www.who.int/news-room/fact-sheets/detail/antibiotic-resistance(accessed on 6 April 2022)
- 3Brown E.D. Wright G.D. Antibacterial drug discovery in the resistance era Nature 201652933634310.1038/nature 1704226791724 · doi ↗ · pubmed ↗
- 4Surette M.D. Wright G.D. Lessons from the environmental antibiotic resistome Annu. Rev. Microbiol.20177130932910.1146/annurev-micro-090816-09342028657887 · doi ↗ · pubmed ↗
- 5Wright G.D. The antibiotic resistome: The nexus of chemical and genetic diversity Nat. Rev. Microbiol.2007517518610.1038/nrmicro 161417277795 · doi ↗ · pubmed ↗
- 6Baquero F. Martínez J.-L. Cantón R. Antibiotics and antibiotic resistance in water environments Curr. Opin. Biotechnol.20081926026510.1016/j.copbio.2008.05.00618534838 · doi ↗ · pubmed ↗
- 7Zhang X.-X. Zhang T. Fang H.H. Antibiotic resistance genes in water environment Appl. Microbiol. Biotechnol.20098239741410.1007/s 00253-008-1829-z 19130050 · doi ↗ · pubmed ↗
- 8Caniça M. Manageiro V. Jones-Dias D. Clemente L. Gomes-Neves E. Poeta P. Dias E. Ferreira E. Current perspectives on the dynamics of antibiotic resistance in different reservoirs Res. J. Microbiol.201516659460010.1016/j.resmic.2015.07.00926247891 · doi ↗ · pubmed ↗
