Community and functional stability in a working bioreactor degrading 1,4-dioxane at the Lowry Landfill Superfund Site
Jessica L. Romero, Jack H. Ratliff, Christopher J. Carlson, Daniel R. Griffiths, Christopher S. Miller, Annika C. Mosier, Timberley M. Roane

TL;DR
This study explores microbial communities and enzymes involved in degrading 1,4-dioxane at a real-world remediation site.
Contribution
The work provides insights into microbial and enzymatic mechanisms of dioxane degradation in a large-scale bioreactor.
Findings
Microbial communities in the bioreactor included Pseudonocardia, potentially involved in dioxane degradation.
Group V SDIMOs and DxmA-like proteins were identified, linked to dioxane degradation.
Evidence of horizontal gene transfer was found, including toxin-antitoxin and transposable elements.
Abstract
1,4-dioxane (dioxane) is an emerging contaminant that poses risks to human and environmental health. Bacterial dioxane degradation is increasingly being studied as a method to remove dioxane from contaminated water. However, there is a lack of studies on microbial community structures and functions within efficient, large-scale, biodegradation-based remediation technologies. The Lowry Landfill Superfund Site (Colorado, USA) uses an on-site, pump-and-treat facility to remove dioxane from contaminated groundwater by biodegradation. Here, 16S rRNA gene and shotgun metagenomic sequencing were used to describe microbial community composition, soluble di-iron monooxygenase (SDIMO) alpha hydroxylases, and potential for dioxane degradation and horizontal gene transfer in bioreactor support media from the facility. Support media showed diverse microbial communities dominated by Nitrospiraceae,…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Fig 1
Fig 2
Fig 3
Fig 4| Clustering parameters | Protein-level clustering | Nucleotide-level clustering | |||
|---|---|---|---|---|---|
| c | aS | Total SDIMOs | Number of DxmA sequences | Total SDIMOs | Number of DxmA sequences |
| 1 | NA | 38 | 14 | 45 | 15 |
| 0.99 | NA | 30 | 9 | 32 | 11 |
| 0.97 | NA | 24 | 7 | 28 | 10 |
| 0.95 | NA | 22 | 7 | 24 | 9 |
| 0.9 | NA | 10 | 2 | 19 | 7 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMicrobial bioremediation and biosurfactants · Microbial Community Ecology and Physiology · Methane Hydrates and Related Phenomena
INTRODUCTION
The organic chemical 1,4-dioxane (dioxane) is an emerging contaminant that has recently gained much attention due to its environmental prevalence and recognition as a possible carcinogen (1). Used since the 1920s as a degreaser and then shortly as a chlorinated solvent stabilizer, dioxane is used in many industrial processes including the production of plastics, glues, cosmetics, antifreeze, pharmaceuticals, textiles, pesticides, and other materials (2–4). Given the various uses of dioxane, it is frequently released from industrial and consumer activities. For instance, U.S. industries were estimated to have released 705,000 pounds of dioxane into the environment in 2015 (5). However, this is likely an underestimate as it does not include, for example, releases as an unwanted chemical byproduct in industrial wastewater (5, 6). In addition, consumer use of products frequently created or contaminated with dioxane during manufacturing, such as soaps and laundry detergents, releases dioxane into waterways and landfills (2, 7). Waterway releases and landfill leachate may enter municipal wastewater treatment systems (7), which are estimated to remove 5% of dioxane inputs at the most (5). Dioxane is not readily broken down by these treatment plants due to its miscibility and the stability of its two ether bonds (8). Due in part to these properties, plus its hydrophilic nature and low volatility, dioxane tends to associate with water (9). Consequently, dioxane has been recognized as a drinking water contaminant as early as 1978 (5). In 2015, the U.S. Environmental Protection Agency (EPA) detected dioxane in 21% of tested drinking water sources in the United States (3). Although EPA classifies dioxane as a potential human carcinogen, the agency does not currently enforce a federal dioxane release limit (3, 7). Therefore, dioxane limits for surface and groundwater differ across the United States (1, 3). In 2004, Colorado became the first state to set a dioxane limit of up to 6.1 µg/L for surface water and groundwater to be met by March 2005 (2). The Colorado dioxane limit has since decreased to 0.35 µg/L (1). Recently, states like New York have begun to restrict dioxane levels in personal care products and cleaning products (7, 10).
Several methods have been tested to remove dioxane from contaminated water with varying successes. Certain physical removal methods are inefficient as they only change the physical phase of or concentrate dioxane without degrading it (3). Chemical methods known as advanced oxidation processes (AOPs) completely degrade dioxane to carbon dioxide with high efficiency but are relatively expensive and less effective in treating chemical mixtures (3, 5, 8, 11, 12). Lastly, biological methods remove dioxane through degradation by microorganisms (biodegradation). Two biodegradation mechanisms exist: direct metabolism, a growth-linked process in which dioxane is directly degraded to yield carbon and energy (13), and co-metabolism, a non-growth-linked process in which dioxane is partially degraded in a fortuitous manner (13, 14). Co-metabolism occurs in the presence of a non-dioxane co-substrate that induces dioxane degradation (13). Co-substrates that have induced dioxane degradation include tetrahydrofuran (THF), methane, propane, butanol, and toluene (15).
To date, approximately 30 bacterial species and three fungal species have shown dioxane degradation (13, 15, 16). The best-characterized dioxane-degrading bacterial strain is Pseudonocardia dioxanivorans CB1190 (CB1190), isolated from dioxane-contaminated industrial sludge (17). Under aerobic conditions, CB1190 directly metabolized 50% of dioxane in a 4.0 mM (~0.35 g/L) culture in 18 hours (17). A later study estimated a dioxane degradation rate for CB1190 at 0.19 ± 0.007 mg/h/mg protein when supplied with 50 mg/L (~0.57 mM) of dioxane (18). An example of a dioxane co-metabolizing bacterium is Methylosinus trichosporium OB3b (OB3b), which co-metabolized 50 mg/L (~0.57 mM) of dioxane at a rate of 0.38 ± 0.02 mg/h/mg protein when supplied with a 25% (vol/vol) methane gas addition and no copper salt addition (18). It should be noted that initial dioxane concentration may affect its degradation rate. Using a model simulation, one study predicted that sites with initial dioxane concentrations of <10 mg/L were more quickly remediated by co-metabolism, whereas initial concentrations of >10 mg/L were more quickly remediated by direct metabolism (19). Direct metabolism of dioxane by CB1190 was predicted to be faster at higher initial concentrations as this would provide more carbon and energy to support growth (19).
While dioxane degradation has been studied extensively in microbial isolate strains, recent work has also shown synergistic degradation by complex microbial communities (20). Using an activated sludge enrichment, one study proposed a dioxane degradation pathway in which the bacteria Xanthobacter spp. and Rhizobiales contributed enzymes that initiated hydroxylation, which was then followed by subsequent pathway steps carried out by other bacteria (20). Another study suggested that certain microbial community members, like Ancylobacter polymorphus ZM13, initiated dioxane degradation whereas other members, like Xanthobacter and Mesorhizobium, metabolized intermediates and alleviated stress responses (21).
Few studies have directly applied dioxane biodegradation toward wastewater and landfill contamination. One study achieved 81.26 ± 6.17% dioxane removal from polyester factory wastewater using a pilot-scale, multistaged reactor inoculated with anaerobic digester sludge (22). Another study reduced dioxane below detection levels (<0.38 µg/L) in landfill groundwater using microcosms bioaugmented with bacterial dioxane-degrading strains (23). Despite these advances, there is currently a lack of literature on microbial community responses within large-scale remediation technologies showing stable, efficient dioxane biodegradation.
The enzymes that initiate microbial dioxane degradation are known as soluble di-iron monooxygenases (SDIMOs). Six SDIMO groups exist based on phylogeny, substrate range, sequence identity, and other traits (15, 24). To date, SDIMOs with potential dioxane degradation capabilities have been identified from all groups, except for Group IV (25). Recently, eight candidate dioxane-degrading SDIMOs were described and searched for broadly in environmental metagenomes (15). This study and others have led to an expanded understanding of the environmental distributions of various dioxane-degrading SDIMO groups. SDIMO composition may be affected by environmental conditions (15, 26–28), with some candidate dioxane-degrading SDIMOs potentially being restricted to certain environments (15). Also, SDIMO groups may provide insights on dioxane degradation mechanisms and mobility. For instance, direct dioxane metabolism (as opposed to co-metabolism) and association with plasmids have both been observed in certain Group V and VI SDIMOs (Dxm of CB1190 and Prm of Mycobacterium dioxanotrophicus PH-06, respectively) (15). Lastly, because different SDIMO groups have varying substrate ranges beyond dioxane (e.g., tetrahydrofuran, propane, and methane) (18), a better understanding of the phylogenetic distribution of direct and co-metabolic dioxane degradation among SDIMO groups could allow for better inference of dioxane prediction and provide insights into other chemicals that may be degraded alongside dioxane. Ultimately, further characterization of SDIMO presence, evolutionary relatedness, structures, and functions in the context of dioxane remediation could better elucidate dioxane biodegradation mechanisms and potentially reveal strategies to select for and/or enhance these mechanisms.
Here, we evaluated the bacterial degradation of dioxane in an on-site flow-through wastewater bioremediation plant at the Lowry Landfill Superfund Site (Aurora, Colorado) (12). In this study, 16S rRNA and shotgun metagenomic sequencing were used to i) identify the microbial communities potentially involved in dioxane degradation at the Lowry Landfill treatment plant, ii) describe the Lowry SDIMOs potentially involved in dioxane degradation and compare these to SDIMOs from the literature, and iii) examine the Lowry protein-coding genes genomically located near the SDIMOs to characterize potential dioxane degradation mechanisms. This work is novel, in that it describes the structures and functions of an in-compliance bioremediation treatment plant. These findings will improve our understanding and management of biological remediation of 1,4-dioxane.
MATERIALS AND METHODS
Site description
The Lowry Landfill Superfund Site in Aurora, Colorado, is a dioxane-impacted site that uses in situ microorganisms in the aboveground bioreactor treatment of dioxane-containing groundwater. Accepting industrial and municipal wastes from 1964 to 1980, wastes were disposed of in 200 acres of unlined pits. While these disposal practices followed regulatory standards of the time, they eventually led to chemical contamination of the surrounding soil, sediment, surface, and groundwater (29, 30). In 1984, the U.S. Environmental Protection Agency designated the landfill a superfund site, prompting a series of remediation efforts (31). This included the installation of an on-site hydraulic control through the groundwater extraction system and aboveground treatment that originally used air stripping and granular activated carbon (32).
While the original treatment processes successfully removed volatile organic compounds from contaminated groundwater, it was later found that the treatment did not reduce dioxane. Furthermore, treated groundwater was re-injected in accordance with the Colorado State groundwater policy until 2004 (32). During this time, it was also found that this practice resulted in a dioxane plume extending from the north end of the landfill.
To address these issues, a new aerobic (micro)biological treatment system (BTS) was installed, which was composed of three aerated moving-bed bioreactors (bioreactors) (Fig. S1A and B) (29, 33). Bioreactor operation has been detailed previously, including the use of Kaldnes K1 polyethylene support media (support media) (Wigan, UK) (Fig. S1C) to provide a surface for microbial growth, coarse bubble diffusion for aeration, and controls for temperature at 23.5°C and pH at 7.0 (12). The three bioreactors concurrently treat groundwater collected from locations distributed across the landfill (12). The removal of dioxane and its known co-substrate tetrahydrofuran (THF) is specifically monitored in accordance with Colorado State permitted standards (12). Quantification of both compounds was performed by Eurofins Scientific (Arvada, CO) (https://www.eurofins.com/) using gas chromatography and mass spectrometry following EPA Method 8260B SIM for dioxane and EPA Method 8260B for THF (34). Dioxane removal efficiency of the BTS generally ranges from 90% to 98% (12) and effluent measures below 25 µg/L of dioxane on average (32). Since 2004, the landfill has discharged the treated effluent to a local municipal wastewater treatment plant under a discharge permit, allowing a limit of 220 µg/L of dioxane (12).
Sample collection
To characterize the microbial community and genomically encoded metabolic potential of the BTS, genomic DNA was extracted from biofilms growing on individual support media sampled on each of four sampling dates from Bioreactor 1 (Fig. S1). Samples were collected at four time points over 3 years: 2019-03-26, 2022-01-18, 2022-01-25, and 2022-03-22. Although these time points do not represent a true time series, they were selected to allow preliminary comparisons of microbial communities across short and long timescales for a single consistent bioreactor. At each sampling, support media were removed from Bioreactor 1 and placed into sterile 50 mL centrifuge tubes. Excess water was decanted from centrifuge tubes before sealing. Collected samples were stored at −20°C.
16S rRNA gene sequencing
To describe the microbial community composition of the Lowry Landfill treatment plant BTS, high-throughput sequencing of the 16S rRNA gene was performed on support media. As part of a CURE (course-based undergraduate research experience) during the Fall 2022 semester, students in the University of Colorado Denver General Biology teaching laboratory extracted total DNA for 16S rRNA gene amplification and sequencing. This allowed for a larger number of replicates sampled from each day. In total, 30 replicates for the four chosen sampling timepoints were available (2019-03-26, n = 8; 2022-01-18, n = 8; 2022-01-25, n = 8; 2022-03-22, n = 6). Using a quarter section of a single support media cut with a sterilized razor blade, DNA was extracted following the protocol from the QIAGEN DNeasy PowerSoil Kit (Hilden, Germany) with two modifications: i) because samples were aliquoted ~4 weeks ahead of time in lysis buffer (solution C1) and stored frozen at −20°C, samples were incubated at 60°C for 10 minutes prior to bead beating for 10 minutes on a laboratory vortex at maximum speed with a 24-tube adapter, and 2) DNA was eluted in 50 µL of kit elution buffer (10 mM Tris). To distinguish support media quarter sections from multiple replicates, support media quarters were assigned a label: “A,” “B,” “C” or “D.” Polymerase Chain Reaction (PCR) amplification of the V4 region of the 16S rRNA gene followed the Earth Microbiome Protocol (https://earthmicrobiome.org/) with the updated 515F (GTGYCAGCMGCCGCGGTAA) (35) and 806R (GGACTACNVGGGTWTCTAAT) (36) primer set. To achieve a dual-indexing sequencing strategy, PCR primers were modified to contain the full P5 and P7 Illumina adapters on the 5′ ends, and these adapters contained unique index dual barcode sequences (37). PCR was set up as follows: 2 µL forward and reverse primer mixture (5 µM each; 0.4 µM reaction concentration), 2 µL support media template DNA, 21 µL PCR-grade molecular water (final volume: 25 µL). Template DNA concentration was not quantified prior to PCR. This mixture was pipetted into bead tubes containing dried down Cytiva Hot Start Master Mix (Marlborough, MA, USA). Thermocycler conditions were as follows: 95°C for 3 minutes; 25 cycles of 95°C for 45 seconds, 50°C for 45 seconds, 72°C for 1 minute; 72°C final extension for 10 minutes. Successful gene amplification was confirmed through gel electrophoresis on a Lonza FlashGel System (Basel, Switzerland). The PCR product was purified and pooled by equal volume using the Zymo Research DNA Clean & Concentrator-5 Kit (Irvine, California). Libraries were subjected to high-throughput sequencing using the Illumina MiSeq platform with 2 × 250 (V2) paired-end reads at the University of Colorado Anschutz Medical Campus.
16S rRNA data analysis
The 16S rRNA data analysis protocols, including specific parameters for each step, can be found at https://github.com/jessieromero418/Lowry-2025-Paper/tree/main/16S. Raw sequencing data and Phred-33 quality scores were imported into QIIME 2 (Version 2023.5.0) with forward and reverse read files that pertained to each of the 30 available samples from the four sampling time points studied. Demultiplexed sequences were denoised within QIIME 2 using the DADA2 denoise-paired plugin (parameters: --p-trim-left-f 0, --p-trim-left-r 0, --p-trunc-len-f 250, --p-trunc-len-r 60, --p-max-ee-f 2, --p-max-ee-r 2, --p-trunc-q 2), and amplicon sequence variants (ASVs) were defined. Taxonomy was assigned to the resulting sequences with the feature-classifier classify-sklearn plugin and the Naïve Bayes classifier trained on the SILVA database (Release 138) (38, 39). To help confirm SILVA taxonomic assignments, the ASV representative sequences underwent a BLASTn search against National Center for Biotechnology Information (NCBI) prokaryote (nt_prok, downloaded 27 May 2024) and virus (nt_viruses, downloaded 5 June 2024) databases. Samples with less than 5,300 reads were removed, reducing this data set to 23 samples across our selected timepoints (2019-03-26, n = 5; 2022-01-18, n = 7; 2022-01-25, n = 6; 2022-03-22, n = 5). Additionally, ASVs that were identified as chloroplasts, mitochondria, were unassigned, or were only assigned to the domain level and had no BLAST nr hit were filtered out of the ASV table. The feature table and representative sequences derived from this filtered data set were used in downstream analysis. The BIOM table and ASV representative sequences are available in the Supplementary Materials. A taxonomy table was generated from this data set with the metadata tabulate plugin. A rooted phylogenetic tree was created using the phylogeny align-to-tree-mafft-fasttree plugin.
The resulting feature table, taxonomy table, rooted phylogenetic tree, and metadata file were uploaded to RStudio (Version 4.3.0) with the mia package (Version 1.8.0) (40) for further analyses and figure generation. To shorten taxonomic labels in plots, the BiocParallel (Version 1.34.2) (41), stringr (Version 1.5.0) (42), S4Vectors (Version 0.38.1) (43), stats4 (Version 4.3.0), and BiocGenerics (Version 0.46.0) (44) packages were used to remove rank prefixes (e.g., “p__” labels, indicating the phylum level, were removed). The mia scater (Version 1.28.0) (45) and patchwork (Version 1.1.2) (46) packages were used to run and display alpha diversity analyses. The phyloseq package (Version 1.44.0) (47) was used to convert the data, a TreeSummarizedExperiment object, into a phyloseq object. The phyloseq (Version 1.44.0) (47), ggplot2 (Version 3.4.2) (48), and patchwork (Version 1.1.2) (46) packages were used to run and display beta diversity analyses. The miaViz (Version 1.8.0), ggplot2 (Version 3.4.2) (48), and ggraph (Version 2.1.0) (49) packages were used to create taxonomy bar plots of the top 20 phyla, families, and genera. Taxa identified as “uncultured” and those not in the top 20 taxa were grouped under the “All other phyla,” “All other families,” or “All other genera” category.
Metagenomic shotgun sequencing
To describe SDIMO genes and visualize gene neighborhoods (spatially colocated genes in a region of a genome), high-throughput, whole-genome shotgun metagenomic sequencing was performed on support media. Support media from each of the four sampling time points (listed above) were processed in triplicates, resulting in a total of 12 samples. Using two quarter sections of a single support media cut with a sterilized razor blade, DNA was extracted with the QIAGEN DNeasy PowerSoil Kit (Hilden, Germany). Bead beating was performed on a laboratory vortex with an adapter for 12 tubes for 10 minutes, and no heating step was added. DNA was eluted in 50 µL of sterile molecular-grade water. DNA purity and concentration were assessed using NanoDrop OneC (Waltham, Massachusetts). The University of Colorado Anschutz Medical Campus Genomics Shared Resource (https://medschool.cuanschutz.edu/colorado-cancer-center/research/shared-resources/genomics) prepared libraries using the NuGEN/Tecan Ovation Ultralow System V2 kit and sequenced the DNA using the Illumina NovaSeq 6000 platform with 2 × 150 paired-end reads on an S4 flow cell lane.
Metagenomic assembly
The metagenomic shotgun sequencing analysis protocols from assembly to Tier 2 analyses, including specific parameters for each step, can be found at: https://github.com/jessieromero418/Lowry-2025-Paper/tree/main/Shotgun. Raw, forward, and reverse read sequencing data files were interleaved with reformat.sh (Version 38.96) (50) for each of the 12 samples. Illumina adapters were trimmed, low-quality bases were trimmed, duplicates were removed, and optical duplicates were removed from interleaved files with rqcfilter2.sh (parameters: rna = f, trimfragadapter = t, qtrim = rl, trimq = 0, maxns = 3, maq = 3, minlen = 51, mlf = 0.33, phix = t, removehuman = f, removedog = f, removecat = f, removemouse = f, khist = f, removemicrobes = f, clumpify = t, dedupe = t, opticaldupes = t, ddist = 12000,tmpdir=, barcodefilter = f, trimpolyg = 5, and usejni = f) (Version 38.96) (50). A sample of 10,000 resulting trimmed reads was assessed for quality with FastQP (Version 0.3.4) and FastQC (Version 0.11.9) (51). All FastQC results were aggregated to compare read quality across samples with MultiQC (Version 1.12) (52). Trimmed reads were deinterleaved with reformat.sh for downstream analysis. De novo assembly of deinterleaved, trimmed reads was performed with MEGAHIT and the --presets meta-large parameter (Version 1.2.9) (53). Each of the 12 samples’ reads was assembled individually. The resulting assemblies were assessed for quality with QUAST (Version 5.2.0) (54). All trimmed read sets were mapped against all assemblies with BBMap and the ambiguous = random parameter (Version 38.96) (50). The resulting 144 BAM files were sorted with samtools sort (Version 1.17) (55). Sorted BAM files required two steps of preprocessing for downstream analysis: i) contig information was trimmed from sorted BAM files with reformat.sh (Version 38.96) (50) and the parameter trimrname = t, and ii) MD tags were generated in reformatted, sorted BAM files with samtools calmd (Version 1.17) (55). Prodigal (Version 2.6.3) (56) was used with the -p meta parameter to predict gene sequences in the assembly contigs of all 12 samples.
Homology search for soluble di-iron monooxygenases
To assess the presence of soluble di-iron monooxygenases (SDIMOs) in bioreactor microbial communities, sequences pertaining to the alpha hydroxylase subunit of SDIMOs were searched for in the 12 whole-genome shotgun metagenomic sequencing samples. Contigs of all sizes were analyzed to avoid exclusion of potential plasmid sequences located on smaller contigs. BLASTp (Version 2.9.0) (57) was used to search for SDIMO alpha hydroxylase subunits in proteins predicted by Prodigal. Searches were conducted in protein space to characterize the physico-chemical properties of amino acid residues and describe potential protein functions. A total of 39 recently described SDIMO sequences (15) were used as query sequences against a database of all predicted Lowry proteins at an e-value of 1e^−50^. These were searched for as they represent a diverse, comprehensive set of candidate dioxane-degrading and non-dioxane-degrading SDIMOs that may aid in the identification of dioxane biodegradation mechanisms. Query sequences belonged to one of four categories: i) candidate dioxane-degrading proteins (CDDPs) from the literature with evidence of dioxane degradation, ii) outgroup proteins (OUTs) from the literature with no evidence of dioxane degradation, iii) composite proteins (COMPs), which were curated with BLASTp from the genomes of known dioxane-degrading bacteria after searching these for CDDPs, and iv) composite outgroup proteins (COMPOUTs), which were curated similarly to COMPs but were presumed not to degrade dioxane based on phylogenetic placement (15). The 39 SDIMO queries were assigned to Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) groups using GhostKOALA (https://www.kegg.jp/ghostkoala/) (58) to predict protein functions. Included among the 39 sequences was a Group IV SDIMO (COMP11, putative alkene monooxygenase alpha subunit [Mycolicibacterium rhodesiae JS60] AAO48576.1). Although Group IV SDIMOs have not been associated with dioxane degradation to date (25), this sequence was kept in this analysis for comparison. Also, although one of the 39 sequences lacked both DE*RH motifs of the carboxylate di-iron center (COMP7, soluble di-iron monooxygenase alpha subunit, partial [Pseudonocardia sp. D17] BAU36819.1), it was still considered as it was associated with direct metabolism of dioxane (15, 59). Descriptions of the 39 query sequences can be found in Tables S1 and S2.
Ranking soluble di-iron monooxygenases
A tiered approach was used to rank Lowry protein sequences obtained from BLASTp hits based on levels of evidence of encoding SDIMOs (a flowchart visualizing this approach is shown in Fig. S2). Tier 1 represented protein sequences with the most evidence of encoding SDIMOs, and these were the primary focus of this paper. Tier 1 was curated by filtering BLASTp outputs to only include proteins with a minimum of 90% amino acid identity with a sequence from a previous study (15), a minimum alignment length of 125 amino acids, and a minimum query coverage of 60%. This resulted in a total of 86 protein sequences in Tier 1, each originating from a separate contig (Fig. S2). Candidate SDIMOs were further evaluated by aligning the sequences with the 39 previously described SDIMO sequences (15) using MAFFT and the –op 1.53 and –ep 0.123 parameters (Version 7.123b) (60). Protein alignments were used to confirm the presence of one or two DERH motifs in carboxylate di-iron centers commonly associated with SDIMOs (61). Proteins were included in Tier 1 if they contained at least one DERH di-iron center motif.
A phylogenetic tree was generated from the protein alignment using FastTree (62) and then annotated in iTOL (63). The tree was re-rooted with Group V SDIMO outgroup sequences. Using monophyletic clades in the tree containing a documented dioxane-degrading sequence, Tier 1 sequences were then categorized as either potential dioxane-degrading proteins or potential outgroup proteins. Clading patterns with previously annotated sequences were also used to aid sequence classification into SDIMO Groups I–VI. To provide more evidence of potential function, proteins were assigned to KO groups using GhostKOALA (58).
Abundances for all 86 Tier 1 sequences were estimated by calculating contig coverages with CoverM (Version 0.6.1) (64) using the reads per kilobase per million mapped reads (RPKM) method. These values were displayed as bar charts in the tree and represent sequence abundances within the sample from which they originated.
To remove identical and near-identical contigs assembled from multiple samples, the contigs from which the 86 Tier 1 protein sequences originated were dereplicated in nucleotide space with CD-HIT-EST (c = 0.999, aS = 0.999) (Version 4.8.1) (65, 66) (Fig. S2). This resulted in a set of 57 non-redundant contigs, each containing one SDIMO alpha hydroxylase sequence. The SDIMO alpha hydroxylase protein sequences pertaining to the 57 nonredundant contigs were aligned and then visualized in a phylogenetic tree with the 39 SDIMO sequences from a prior study (15) as described above. To manually cluster the Tier 1 tree and analyze general trends across SDIMOs, the most abundant sequence was identified within each clade (Fig. S2). This sequence was then used as the representative Lowry sequence for each clade. If the most abundant sequence was found to contain a gap in the first di-iron center of the alpha hydroxylase (i.e., the first DE*RH motif was missing in the protein alignment), then the next most abundant sequence that contained this structure was used as the representative Lowry sequence instead. When selecting representatives, the presence of the first di-iron center region was prioritized over the second region as it is near the hydrophobic residues that determine SDIMO group and function (67, 68). Using protein alignments, these hydrophobic residues helped confirm SDIMO classification into Groups I–VI. This reduced the 57 Lowry protein sequences to seven clusters, each with one representative sequence. These seven cluster representative sequences and the 39 previously identified SDIMO (15) were aligned as described previously and visualized in Microsoft Excel and Jalview (Version 2.11.2.7). A phylogenetic tree was generated from this alignment as described previously. To compare cluster abundances, the abundances of the original 86 Tier 1 protein sequences were summed within each cluster if they originated from the same sample. These summed values were displayed in heatmaps in the clustered tree and represent total abundances of sequences within each cluster within each sample.
To compare sequence diversity, the SDIMO alpha hydroxylases within the 57 nonredundant contigs were clustered in protein space with CD-HIT (Version 4.8.1) (65, 66) and in nucleotide space with CD-HIT-EST (Version 4.8.1) (65, 66) at various sequence identity thresholds (c = 1, 0.99, 0.97, 0.95, and 0.9).
Tier 2 represented Lowry protein sequences with less evidence of encoding SDIMO alpha hydroxylases. This was curated by filtering BLASTp outputs to include proteins with 50%–90% amino acid identity with a sequence identified by Goff and Hug (15), a minimum alignment length of 125 amino acids, and a minimum query coverage of 60%. This resulted in a total of 622 potential Tier 2 SDIMOs that originated from 606 separate contigs (Fig. S2). Contigs that appeared in Tier 1 were removed from this set, resulting in 564 Tier 2 contigs. The 564 Tier 2 contigs were also dereplicated to remove identical and near-identical contigs in nucleotide space with CD-HIT-EST (c = 0.999, aS = 0.999) (Version 4.8.1) (65, 66) (Fig. S2). This resulted in 355 nonredundant contigs, containing 367 Tier 2 SDIMOs. The 367 Tier 2 protein sequences and the 39 previously described SDIMO sequences (15) were aligned with MAFFT and the –op 1.53 and –ep 0.123 parameters (Version 7.123b) (60). An unrooted phylogenetic tree was generated from the protein alignment using FastTree (62) and then annotated in iTOL (63). Tier 2 sequences were colored by the predicted SDIMO group based on clading patterns with literature sequences.
Preliminary homology search for intermediate-degrading enzymes
To search for potential evidence of dioxane degradation pathway steps after initial hydroxylation by monooxygenases, sequences pertaining to enzymes previously proposed to be involved in intermediate degradation were searched for in the 12 whole-genome shotgun metagenomic sequencing samples. Query sequences were obtained from three different studies on gene expression during dioxane degradation in bacterial strains and microbial consortia (20, 21, 69). These sequences were (i) upregulated during growth on dioxane or glycolate in Pseudonocardia dioxanivorans CB1190 in a microarray experiment (n = 69) (69), (ii) the most abundant enzymes detected during dioxane degradation according to shotgun metagenomic data in a microbial consortium derived from activated sludge (and were later upregulated according to qPCR) (n = 5) (20), or (iii) upregulated during growth on dioxane according to metatranscriptomic data in a microbial consortium derived from activated sludge (n = 38) (21). NCBI accession numbers of the 112 query sequences were checked to confirm that no duplicates were present. The 112 protein queries were assigned to Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) groups using GhostKOALA (58). BLASTp (Version 2.9.0) (57) was used to search for the 112 protein queries in all predicted proteins at an e-value of <1e^−50^.
Gene neighborhoods surrounding soluble di-iron monooxygenases
The gene neighborhoods associated with the Lowry contigs containing Tier 1 Group V, DxmA SDIMO alpha subunit genes (n = 17 contigs) were analyzed. All 17 contigs were annotated in the DOE Systems Biology Knowledgebase (KBase) (70) with the “Annotate and Distill Assemblies with DRAM” (v0.1.2) application (71) and the “Annotate Metagenome Assembly and Re-annotate Metagenome with RASTtk” (v1.9.5) application (72). This provided gene annotations from DRAM (73), RAST (74), KEGG (75), and the Protein Families Database (PFAM) (76). Predicted proteins were also analyzed using a protein BLAST against the NCBI non-redundant database (57). For protein BLAST results, the annotation with the lowest E-value was selected, except in cases where the protein was classified as hypothetical or unknown. In these instances, the next lowest E-value hit with an annotated protein was considered, provided the E-value remained below 1e^−05^. Final annotations were assigned when two or more of the annotation methods (DRAM, RAST, KEGG, PFAM, and BLASTp against nr) agreed on the same annotation. In instances where none of the annotations agreed, the BLASTp result was taken (as long as the E-value <1e^−05^) and the annotation was named as “putative.” If no annotations were identified in any of the methods, the protein was called “hypothetical.” To gather more evidence of potential horizontal gene transfer, the 17 contigs were evaluated for plasmid and viral sequences using geNomad (Version 1.7.0) with default parameters (77). Putative transposase gene sequences were evaluated with TnCentral using default parameters (78). Putative toxin/antitoxin gene sequences were evaluated with TAfinder 2.0 using default parameters (79). Putative miniature inverted-repeat transposable elements (MITEs) were evaluated with MITE Tracker using default parameters (80).
The 17 DxmA contigs were aligned using Geneious Prime (Version 2024.0.7; https://www.geneious.com/) with the “map to reference” feature utilizing the Geneious mapper on the highest sensitivity using “find structural variants, short insertions, and deletions of any size,” with contig S3_k127_1715793 as the reference sequence. Four contigs containing candidate dioxane-degrading proteins identified previously (15) were also aligned with the Lowry contigs. Phylogenetic analysis was done using RAxML Version 4 (81) with a GTR GAMMA nucleotide model and a rapid hill-climbing algorithm and 100 bootstrap replicates. One additional reference sequence (Pseudonocardia asaccharolytica DSM 44247; NZ_AUII01000002) trimmed to only include the dioxane monooxygenase alpha, beta, coupling, and reductase components) was used to root the tree.
RESULTS
Site chemistry
Site operators regularly take measurements of key chemical concentrations flowing into the BTS. For the 4 days sampled (2019-03-26, 2022-01-18, 2022-01-25, and 2022-03-22), input dioxane concentrations ranged from 15,000 µg/L to 28,000 µg/L (Fig. 1). To better illustrate the dioxane concentrations across the 3-year sampling period, the median of all weekly measurements from the first sampling date to the last sampling date (n = 196) was 19,000 µg/L (Fig. 1). On average, BTS effluent dioxane levels measure less than 25 µg/L. THF was also monitored since some SDIMO enzymes are known to co-metabolize dioxane alongside THF. THF concentrations in the BTS influent ranged from 16,000 µg/L to 31,000 µg/L on the four days sampled (median across the three-year sampling period = 21,000 µg/L) (Fig. 1).
Input dioxane concentrations to the sedimentation tank of the Lowry Landfill Biological Treatment System (BTS). Values for each of the four dates sampled for shotgun metagenomics are shown alongside the distribution (kernel density estimate and quartiles/whiskers) of values from regular weekly sampling between the first and last sampling date (n = 196 weekly measurements). The median concentration for dioxane weekly measurements over the 3-year sampling date range was 19,000 µg/L (A). The median concentration for THF, a common substrate for SDIMO enzymes that co-metabolize 1,4-dioxane, was 21,000 µg/L (B).
Microbial community composition of bioreactor support media
To describe the microbial community composition of Bioreactor 1, 16S rRNA gene sequences were analyzed from 23 Bioreactor 1 support media samples (Table S3) over the selected timepoints (2019-03-26, n = 5; 2022-01-18, n = 7; 2022-01-25, n = 6; 2022-03-22, n = 5). A total of 884 ASVs were identified across this data set. Alpha diversity analyses across the samples showed Observed Richness ranging from 113 to 508, evenness (Shannon metric) from 3.8 to 4.5, Simpson metric from 0.04 to 0.20, and phylogenetic diversity (Faith PD metric) from 11 to 36 (Fig. S3; Table S3). Nonmetric multidimensional scaling (NMDS) beta diversity analyses based on Unifrac distances showed overall community stability across samples and time points (Fig. S4). The 2019 samples had somewhat different richness and diversity estimates from the 2022 samples and clustered independently in the NMDS plots but overall had similar taxonomic groups. This suggests that the overall community membership was similar, but there were some slight differences in community structure. Overall, the microbial communities were very stable over time.
Microbial community composition was diverse across multiple taxonomic levels. Across the quartered support media, microbial community composition was generally similar (Fig. S5). This trend was also observed after grouping support media by sampling date (Fig. 2). A total of 31 phyla were identified across all samples. At the phylum level, communities were consistently dominated by Proteobacteria (34.1%–53.5% per support media quarter), followed by Nitrospirota (7.4%–23.9%), Actinobacteriota (5.5%–17.5%), and Crenarchaeota (4.7%–15.6%) (Fig. S5A). Other commonly occurring phyla included Acidobacteriota, Planctomycetota, Bacteroidota, Verrucomicrobiota, Myxococcota, and Chloroflexi (Fig. S5A). At the family level, approximately 196 families were identified in the bioreactor samples. The most abundant families included Nitrospiraceae (7.3%–23.8% per support media quarter), Nitrososphaeraceae (4.6%–15.6%), Nitrosomonadaceae (4.8%–13.2%), Pseudonocardiaceae (1.8%–8.4%), Hyphomicrobiaceae (2.6%–6.3%), Parvularculaceae (1.9%–7.5%), Steroidobacteraceae (3.2%–5.4%), Saprospiraceae (1.7%–3.8%), Rhodocyclaceae (1.1%–4.3%), and Methyloligellaceae (0.4%–3.4%) (Fig. S5B; Fig. 2A). At the genus level, common taxa included Nitrospira (7.3%–23.8% per support media quarter), Candidatus Nitrocosmicus (4.6%–15.6%), mle1-7 (4.8%–13.1%), Amphiplicatus (1.9%–7.5%), and Pseudonocardia (1.8%–8.4%) (Fig. S5C; Fig. 2B). Other genera included Steroidobacter, Hyphomicrobium, Sulfuritalea, Subgroup_10, Amb-16S-1323, Phaeodactylibacter, OM190, Ga0077536, 67-14, Dadabacteriales, Haliangium, Pedomicrobium, Nocardiodes, Dongia, and Marmoricola (Fig. S5C; Fig. 2B). Overall, the taxonomy showed stability over time and across individual samples.
Microbial community composition bar plot of the top 20 most abundant families (A) and genera (B) in Lowry Landfill Bioreactor 1 support media quarters (n = 23), after grouping these by sampling date. "All other families" and "All other genera" represent taxa that were not among the top 20 most abundant taxa.
Many of the ASVs in the data set represent potential novel groups: 150 were identified as “uncultured” at the genus level (with no genus identified), 71 were identified only down to the family level (with no genus), 10 identified only to the order level, 18 identified only to the class level, nine identified only to the phylum level, and 29 identified only to the domain level.
Several ASVs were taxonomically related to bacterial genera containing previously reported dioxane degraders: 8/884 ASVs were identified as Pseudonocardia (two of these ASVs were present in 23/23 samples; the relative abundance of the Pseudonocardia genus ranged from 1.8% to 8.4% per support media quarter), 4/884 as Mycobacterium (one of these ASVs was present in 23/23 samples; Mycobacterium ranged from 0.4-0.9%), 2/884 as Flavobacterium (present in 4/23 samples total; Flavobacterium ranged from 0% to 0.03%), 2/884 as Acinetobacter (present in 2/23 samples total; Acinetobacter ranged from 0% to 0.1%), 3/884 as Pseudomonas (present in 6/23 samples total; Pseudomonas ranged from 0% to 0.3%), and 1/884 as Afipia (present in 5/23 samples total; Afipia ranged from 0% to 0.2%). These results suggest that bacteria that were closely related to known dioxane degraders were present across Lowry support media.
SDIMO diversity and abundance
To describe the bacterial proteins potentially involved in dioxane degradation in Bioreactor 1, predicted proteins on Lowry contigs assembled from 12 shotgun sequencing samples (Table S4) were searched against 39 known SDIMO alpha hydroxylases (15) using BLASTp. This revealed a phylogenetically diverse set of 86 Tier 1 SDIMO alpha hydroxylases in the support media. Per sample, Tier 1 SDIMOs ranged from 9.94 × 10⁻⁵−2.54 × 10⁻⁴% of predicted proteins. After clustering the contigs containing the 86 SDIMO alpha hydroxylases, a nonredundant set of 57 contigs was obtained (Fig. S2). SDIMOs pertaining to Groups I (n = 2), II (n = 4), IV (n = 7), and V (n = 73) were observed (Fig. S6). Three SDIMO groups of candidate dioxane-degrading proteins (CDDPs) were identified based on clading patterns, each of which was assigned one representative Lowry sequence (Fig. 3): Group II CDDPs (n = 3), Group V, PrmA*-like CDDPs (n = 13), and Group V, DxmA-like CDDPs (n = 38). Additional protein sequences phylogenetically grouped with a Group IV SDIMO (a putative alkene monooxygenase, n = 7), as well as outgroups for Group I (n = 2), Group II (n = 1), and Group V, PrmA*-like outgroups (n = 22). Group I sequences were assigned to K16242 (dmpN, poxD, tomA3; phenol/toluene 2-monooxygenase (NADH) P3/A3) and Group II sequences to K15760 (tmoA, tbuA1, touA; toluene monooxygenase system protein A). In Group IV, 6/7 sequences were assigned to K18223 (prmA; propane 2-monooxygenase large subunit) with a second possible assignment to K22353 (etnC, alkene monooxygenase alpha subunit). One Group IV sequence (LOWRY-S8_k127_2122582_2) lacked a KO assignment. Like other Group IV sequences, LOWRY-S8_k127_2122582_2 was annotated as “Methane/Phenol/Toluene Hydroxylase [PF02332.21]” by PFAM and “Methane monooxygenase component A alpha chain (EC 1.14.13.25)” by RAST, suggesting this sequence may still show monooxygenase activity. Group V sequences were assigned to K18223. Although previous studies have found CDDPs pertaining to Group III (K16157: mmoX; methane monooxygenase component A alpha chain) (18) and Group VI (82), no Lowry proteins curated in this protocol were identified as such. Thus, Tier 1 included all SDIMO groups, except Groups III and VI.
Manually clustered protein phylogenetic tree of seven representative sequences recovered from Lowry Landfill Bioreactor 1 support media and 39 previously described candidate SDIMOs (15). Phylogenies of candidate SDIMOs are included in their sequence names. Protein abundances were estimated using coverages (RPKM) of the 86 original Lowry contigs containing SDIMOs, summed by clade and by sample, and are displayed in the heatmap. Proteins that are presumed not to degrade dioxane due to monophyletic clading with known outgroups are written in gray text. The branches for OUT2 and CDDP4 were dashed as these showed monophyletic clading that was unexpected based on their prior description (15). Red stars indicate sequences that have shown direct metabolism of dioxane in the literature. Empty circles indicate sequences containing a gap in the first di-iron center of the SDIMO alpha hydroxylase (i.e., missing the first DERH motif). Clades that contained at least one sequence with an inverted terminal repeat or a plasmid score above the default threshold according to geNomad were marked with green triangles or blue squares, respectively. The SDIMO group number, number of sequences per phylogenetic clade, and KO assignments of representative sequences are listed in clade boxes.*
To compare the prevalence of these proteins, abundances were estimated using contig coverage (RPKM) associated with the 86 Tier 1 sequences (Fig. 3). Within each monophyletic clade, the most abundant contig that did not show a gap in the first di-iron center of the alpha hydroxylase was chosen as a representative and displayed in the tree. SDIMO classification was confirmed for these sequences in protein alignments using hydrophobic residues surrounding the first di-iron center region (see Methods and Fig. S7). When abundances were summed across all 12 samples by group, Group V sequences showed the highest total abundance (105.4 RPKM), followed by Group IV (1.1 RPKM), Group II (0.5 RPKM), and finally Group I (0.3 RPKM). Group V sequences were present across all 12 samples. DxmA-like CDDPs were also found in all 12 samples and were the most abundant clade. DxmA-like summed abundances ranged from 0.7 to 8.7 RPKM per sample and had a total abundance of 68.1 RPKM. The single most abundant Lowry sequence was LOWRY-S2_k127_1216267_5 (6.4 RPKM). This was closely related to DxmA-like CDDPs and was selected as the representative sequence for this clade (Fig. 3). This shared 99% amino acid identity with DxmA from known dioxane-degrading strain CB1190 (WP_103383250.1, CDDP1) and 100% amino acid identity with a partial SDIMO alpha hydroxylase subunit in Pseudonocardia sp. D17 (BAU36819.1, COMP7) and a methane monooxygenase from Pseudonocardia sp. N23 (WP_098956496.1, COMP9). All three of these proteins have shown direct metabolism of dioxane (15, 18, 59, 69, 83). The LOWRY-S2_k127_1216267_5 sequence also shared high amino acid identity with DxmA-like sequences that showed co-metabolic dioxane degradation in the presence of THF (15, 84). For example, this shared 99.8% identity with the alpha hydroxylase in Pseudonocardia sp. ENV478 (AEI99544.1, COMP8). Overall, Group V was the most abundant SDIMO group in Tier 1, with certain proteins sharing high amino acid identity to directly metabolic and co-metabolic DxmA-like CDDPs.
Because several Lowry sequences claded with DxmA-like CDDPs, these were analyzed further to describe the protein and gene sequence diversity of this clade. Clustering the alpha hydroxylases within the 57 nonredundant contigs resulted in multiple DxmA-like CDDP representatives, even at various stringency levels of sequence similarity (Table 1). For example, after clustering the 57 alpha hydroxylase proteins from the nonredundant contigs at 100% amino acid identity, 38 nonredundant proteins were obtained. Of these proteins, 14 claded with DxmA-like CDDPs. After clustering the 57 proteins at 90% amino acid identity, 2/10 remaining proteins claded with DxmA-like CDDPs. At the nucleotide level, clustering the 57 alpha hydroxylase genes at 100% nucleotide identity resulted in 45 nonredundant genes. Of these, 15 were identified as DxmA-like genes. Clustering at 90% nucleotide identity resulted in 19 genes, seven of which were identified as DxmA-like genes (Table 1). DxmA-like protein and gene sequences were consistently clustered into multiple groups, illustrating the amino acid and nucleotide sequence diversity of this one clade.
Other CDDPs identified in the Lowry samples included Group V, PrmA*-like CDDPs, represented by LOWRY-S2_k127_38155_23 (Fig. 3). This shared 96.4% amino acid identity with CDDP8, the co-metabolic PrmA* protein identified in Rhodococcus sp. RR1 (ADM83577.1) that was associated with dioxane degradation after induction with toluene (18). Other co-metabolic SDIMOs found in this clade included PrmA*-like proteins from Rhodococcus jostii RHA1 (Q0SJK9.1, COMP13; WP_011593714.1, COMP16), a bacterium that degraded dioxane in the presence of propane and 1-butanol growth substrates (85). PrmA*-like CDDPs were detected in all 12 samples at relatively low summed abundances (0.1–1.1 RPKM per sample) and a total abundance of 6.4 RPKM. Group II CDDPs were also detected. These were represented by LOWRY-S6_k127_1602557_1 (Fig. 3), which claded and shared 95.8% amino acid identity with CDDP3, the co-metabolic TmoA protein identified in Azoarcus sp. DD4 (QDF97112.1) that degraded dioxane after induction with toluene (86). Group II CDDPs were only detected in a subset of the samples at low summed abundances (0.1–0.2 RPKM per sample, total abundance = 0.4 RPKM). Lastly, Group IV SDIMOs were detected. However, these have not been associated with dioxane degradation to date (25). These sequences were represented by LOWRY-S11_k127_31664_3 (Fig. 3), which shared 96.6% amino acid identity with a putative alkene monooxygenase alpha subunit in Mycolicibacterium rhodesiae JS60 (AAO48576.1, COMP11). Group IV SDIMOs were detected in a subset of the samples at low summed abundances (0.1–0.2 RPKM per sample, total abundance = 1.1 RPKM). The presence of these SDIMOs provides evidence for other potential dioxane co-metabolism mechanisms and for biodegradation of other chemicals in the Lowry bioreactor.
Initial evaluation of potentially novel SDIMOs
Using less strict criteria, Tier 2 was developed from Lowry Landfill Bioreactor 1 proteins and represented those with less evidence from sequence homology of encoding SDIMO alpha hydroxylases. Tier 2 consisted of 622 potential SDIMOs. After filtering out Tier 1 contigs, clustering resulted in 355 nonredundant contigs containing a total of 367 potential SDIMOs (Fig. S2). An unrooted protein phylogenetic tree of the 367 potential SDIMOs and 39 previously described SDIMOs (15) revealed a phylogenetically diverse set of sequences predicted to pertain to Groups I (n = 130), II (n = 169), III (n = 1), IV (n = 18), V (n = 42), and VI (n = 7) (Fig. S8). The Tier 2 potential SDIMO proteins were closely related to diverse CDDPs: Group I TomA3-like CDDPs, Group II CDDPs, Group III sMMO-like CDDPs, Group V PrmA*-like CDDPs, and Group VI PrmA-like CDDPs. Therefore, this protocol curated additional, more phylogenetically diverse SDIMOs that were in the bioreactor. Further efforts are needed to determine the function of these proteins, but they may represent novel, uncharacterized SDIMOs involved in dioxane or other co-contaminant biodegradation.
Initial evaluation of potential intermediate-degrading enzymes
To determine whether proteins potentially involved in dioxane intermediate degradation were present, predicted proteins were searched against 112 sequences that either coded for previously proposed dioxane degradation pathway enzymes and/or were upregulated in isolates or consortia during dioxane degradation (20, 21, 69) (Tables S5 and S6). Multiple homologs to suspected intermediate-degrading enzymes were detected (Tables S5 and S6; Fig. 4; Fig. S9), e.g., alcohol dehydrogenase and aldehyde dehydrogenase were detected (Tables S5 and S6), involved in forming 1,4-dioxane-2-one and 2-hydroxyethoxyacetic acid, respectively, following the initial hydroxylation of dioxane by a monooxygenase (20, 21, 69). Etherase was detected (Tables S5 and S6), which may transform these intermediates into ethylene glycol and glycolic acid, respectively (20). Aldehyde reductase, alcohol oxidoreductase, and glycolate oxidase, potentially involved in glyoxylate formation (20, 21, 69), were found. Previous studies show that glyoxylate intermediates may enter the TCA cycle through two different pathways: through the formation of pyruvate using glyoxylate carboligase and glycerate kinase or through the formation of malate using malate synthase and 2-isopropylmalate synthase (21, 69). Importantly, all four of these enzymes immediately preceding the TCA cycle were detected (Table S5), suggesting that microorganisms in the Lowry bioreactor potentially have the ability to degrade dioxane to carbon dioxide (21, 69).
Nucleotide alignment of overlapping contigs. Blue columns indicate the 127 base-pair stretches where the assembly broke. Predicted proteins and annotations (color-coded by function; see inset) are shown for the two longest contigs.
Additionally, other non-enzyme sequences that were previously upregulated in CB1190 during dioxane degradation in a microarray experiment (69) were also detected in the Lowry samples (Table S6; Fig. 4; Fig. S9). This included GntR family transcriptional regulators, manganese transport proteins, and TetR regulatory proteins (Table S6). This may suggest that similar dioxane degradation mechanisms used by CB1190 also commonly occur in the Lowry bioreactor. Future experiments will need to confirm whether these inferred genes are actively expressed during dioxane degradation in Lowry bioreactor microorganisms.
Search for evidence of dioxane degradation pathway steps and horizontal gene transfer in gene neighborhoods
We sequenced and assembled Lowry contigs to evaluate the SDIMOs and CDDPs in a genomic context. However, in closely evaluating the sequence data, we noticed that the assembly was breaking across several of the relevant contigs. For instance, assembly breaks were noted in the S3_k127_1715793 and S2_k127_1904549 contigs, which were each longer than 27,000 nucleotides and contained abundant Group V, DxmA-like proteins (Fig. 4). An additional 20 shorter contigs mapped to these longer contigs with high identity (~99% nucleotide identity across all contigs), including contigs from all samples except for S9. Curiously, there were four separate regions of overlap in the alignment where the assembly appeared to break. The four sequence overlaps were each exactly 127 bp long, but each contained different nucleotide sequences, suggesting micro-variation in the population. Three out of the four sequence overlaps occurred near the end of a protein-coding gene.
Given the challenges with genome assembly, our analysis of SDIMO gene neighborhoods focused on contigs (instead of metagenome-assembled genomes). Based on their high abundance in the Lowry system, we further evaluated the DxmA-like CDDP-containing contigs (n = 17 out of the 57 nonredundant contigs) to look for phylogenetic relationships to other known dioxane degraders, evidence of other SDIMO components, dioxane degradation pathway steps, and horizontal gene transfer (Fig. 4; Fig. S9).
Candidate dioxane-degrading contigs identified in a previous survey (15) were compared to the Lowry contigs. The Lowry contigs were phylogenetically related to the Pseudonocardia contigs with strong bootstrap support (>60 for most of the contigs) including Pseudonocardia sp. N23 (NZ_BEGX01000008), Pseudonocardia sp. ENV478 (HQ699618), Pseudonocardia tetrahydrofuranoxydans strain K1 (AJ296087), and Pseudonocardia dioxanivorans EUR-1 (NZ_SJXF01000101). One additional sequence was used as an outgroup by extracting just the SDIMO subunits: Pseudonocardia asaccharolytica DSM 44247 (NZ_AUII01000002). This contig-level phylogeny aligned with the protein-level phylogeny showing the individual Lowry DxmA-like alpha subunit proteins phylogenetically clustering with Pseudonocardia alpha subunit proteins (Fig. S6).
The gene neighborhood analysis showed that the SDIMO subunit gene order was conserved across nearly all 17 of the Lowry DxmA-like CDDP-containing contigs, only excluding contigs that did not extend over the entirety of the subunits. For each of the complete contigs (containing each of the subunits), the SDIMO subunits were ordered as dioxane monooxygenase alpha subunit, reductase component, beta subunit, and regulatory/coupling component, which has been previously observed in Group V SDIMOs (86). Outside of the SDIMO subunits, predicted semi-aldehyde dehydrogenase, EthD domain, and GntR family transcriptional regulator proteins were located upstream. EthD is a component of a cytochrome P-450 gene, which was associated with the degradation of ethyl tert-butyl (ETBE) in Rhodococcus rubber IFP 2001 (87). Downstream of the SDIMO subunits were predicted GPR1/FUN34/yaaH family transporters. Putative proteins occurred at the ends of the contigs, including putative metal transporters and putative transposases. Contigs identified in a previous study (15) had the exact same order of SDIMO subunits as Lowry contigs and nearly the same upstream and downstream proteins. The contig from Pseudonocardia dioxanivorans (NZ_SJXF01000101) extended further downstream than any of the Lowry contigs, including an Nramp family divalent metal transporter in place of the putative metal transporters seen in the Lowry contigs, followed by an IS1634 family transposase, tyrosine-type recombinase/integrase, zeta toxin family protein, and two hypothetical proteins.
The Lowry DxmA-like contigs had several genes commonly associated with horizontal gene transfer. Several of these contigs (13/17) were assigned a plasmid identification score above the default threshold (0.7) by geNomad (77) (Fig. S6). Toxin-antitoxin proteins were predicted in the annotated Lowry contigs (S3_k127_1715793, S2_k127_1904549, and S9_k127_1368382). Specifically, TAfinder 2.0 (79) identified a set of Type II toxin-antitoxin phd-doc (prevent host death/death on curing) family proteins, which have been proposed to be involved in plasmid stabilization and many other functions such as stress modulation and antibiotic persistence (88, 89). Other annotation methods (PFAM and BLASTp) identified an additional set of Type II toxin-antitoxin proteins (though not identified by TAfinder 2.0).
According to geNomad, no Lowry proteins had a virus score above the default threshold (0.7), though annotation methods did identify phage integrase proteins likely related to tyrosine family integrases that utilize a catalytic tyrosine to mediate strand cleavage (90).
Putative transposases in the Lowry contigs (S3_k127_1715793_29, S9_k127_1368382_1, and possibly the truncated S2_k127_1904549_1) appeared to be related to the IS5 family of insertion sequences based on BLASTp results (against nr) and TnCentral (78), which have been shown to cause mutations in E. coli and are found in taxonomically diverse microbes (78). The S2_k127_1904549 contig had predicted transposases at the beginning and end of the contig (S2_k127_1904549_1 and S2_k127_1904549_29), which were 100% identical but in reverse complement orientation (geNomad identified this region as an Inverted Terminal Repeat; Fig. S6). The S3_k127_1715793 contig was dissimilar and had a hypothetical protein at the end of the contig instead of another transposase. Complete insertion sequences or transposons were not identified in any of the Lowry contigs.
Putative miniature inverted-repeat transposable elements (MITEs; short, non-autonomous transposable elements that can contribute to genetic variability and regulation) sequences were identified in the longest Lowry DxmA-like CDDP contigs (S3_k127_1715793, S9_k127_1368382, and S2_k127_1904549) using MITE Tracker (80). The three contigs each had many predicted MITEs across the length of the sequence. Many of the MITEs were overlapping, so the total number of predicted MITEs was uncertain. MITEs were predicted within protein-coding sequences and within intergenic regions. Two of the four 127 bp overlapping regions within the broken assembly (Fig. 4) were within predicted MITEs. The Pseudonocardia contigs (NZ_SJXF01000101.1, HQ699618.1, AJ296087.1, NZ_BEGX01000008.1) also contained several MITEs, though fewer than the Lowry contigs in the regions that aligned. The Pseudonocardia and Lowry contigs shared a few predicted MITEs with similar structure and genomic location. Further research is needed to confirm whether or not the predicted MITEs are functional.
DISCUSSION
The microbial communities of the Lowry Landfill Bioreactor 1 are diverse and stable over the sampled 3-year period and selected shorter timescales. Our 16S rRNA gene sequence data show that a number of taxa fall within genera that also contain known dioxane degraders. Of those genera, we detected (from highest to lowest 16S rRNA relative abundances per support media quarter) Pseudonocardia (Actinobacteriota), Mycobacterium (Actinobacteriota), Pseudomonas (Proteobacteria), Afipia (Proteobacteria), Acinetobacter (Proteobacteria), and Flavobacterium (Bacteroidota). Therefore, potential dioxane degraders in the bioreactor originated from diverse bacterial phyla. Interestingly, certain strains in the Proteobacteria phylum (e.g., Afipia sp. strain D1 and Acinetobacter baumannii DD1) show direct metabolism of dioxane in culturing studies (91, 92), but their degradation mechanisms have not yet been described (15). Detection of these genera in the Lowry bioreactor may indicate that additional degradation mechanisms may have been present but were missed in the current analyses that relied on homology to known and inferred SDIMO-degrading enzymes. Also, Pseudonocardiaceae and Rhodocyclaceae, families detected in our system, each contain species that have shown either direct metabolism or co-metabolism of dioxane (15, 16). Taxa in the Hyphomicrobiaceae and Nocardiodaceae families are present in the bioreactor and were previously shown to increase in abundance during dioxane degradation in a microcosm experiment (28). Although the potential Lowry dioxane degraders were detected in support media, they were not the dominant taxa. Instead, other taxa within the Nitrospiraceae, Nitrososphaeraceae, and Nitrosomonadaceae groups were highly abundant in the support media communities, suggesting an important role for nitrogen cycling. Overall, the 16S data provide first insights into a complex but relatively stable community within the bioreactors. The relationship between dioxane degradation and the other important metabolisms (e.g., nutrient cycling and degradation of other known contaminants in the reactor) in the community remains to be discovered.
From the Tier 1 SDIMO protein data set, Group V SDIMOs dominated in the Lowry Landfill bioreactor support media bacterial communities. Protein phylogenetic trees and alignments showed Group V SDIMOs consisting of two CDDPs: DxmA-like proteins and PrmA*-like proteins. Of these, the DxmA-like proteins were present across all Lowry support media microbiomes, usually as the most abundant SDIMO protein. Some of these proteins shared high amino acid sequence identity with known DxmA-like proteins, including those from Pseudonocardia dioxanivorans str. CB1190 (WP_103383250.1, CDDP1), Pseudonocardia sp. D17 (BAU36819.1, COMP7), and Pseudonocardia sp. N23 (WP_098956496.1, COMP9), all of which have shown direct metabolism of dioxane (15, 18, 59, 69, 83). Phylogenetic trees showed that some co-metabolic enzymes requiring induction by THF to degrade dioxane, for example, an SDIMO from Pseudonocardia sp. ENV478 (AEI99544.1, COMP8) (15, 84), claded with DxmA-like directly metabolic enzymes and Lowry proteins (Fig. 3). Therefore, it is possible that some Lowry proteins in this clade similarly require THF to degrade dioxane co-metabolically. In the BTS, THF concentrations are being monitored as it has been a suspected growth substrate for co-metabolic dioxane degradation. However, the presence of THF-induced co-metabolic degradation of dioxane in the BTS or in the dioxane plume surrounding the facility has not been tested to date. Furthermore, studies have not been able to definitively identify gene sequences or components resulting in either direct metabolism or co-metabolism of dioxane (15). Therefore, Lowry DxmA-like sequences could not be searched for signatures of THF co-metabolism. As a result, although DxmA-like proteins were frequently detected in support media, further experiments are necessary to test not only their dioxane degradation ability but also their potential degradation mechanism.
The DxmA-like contigs contained many genomic signatures of potential mechanisms for horizontal gene transfer (HGT), including toxin-antitoxin systems, phage integrases, recombination endonucleases, putative transposases, and putative miniature inverted-repeat transposable elements (MITEs). Also, DxmA-like contigs were commonly assigned high geNomad plasmid scores (>0.7), suggesting association with a plasmid. The large number of HGT signatures and diverse mechanisms of function (e.g., plasmids and transposition) within genomic proximity to the SDIMO protein coding genes suggests that these genes may be mobile within the community. The functional consequences of these potential HGT signatures are unknown but could play a role in the microbial evolution within the reactors, population-level variability, and regulation of these important functional genes. For instance, MITEs have been shown to play a role in genome plasticity and gene regulation in bacteria (93, 94). The environmental context of the Lowry bioreactors (a complex contaminant mixture) may put a selection pressure on the community to move the SDIMO genes within the population in order to maintain functional stability. These findings would be interesting to follow up on in the context of dioxane remediation and regulation.
Group V, PrmA*-like proteins were also frequently detected in the Lowry Landfill bioreactor support media, although at lower abundances than the DxmA-like proteins. These shared high amino acid sequence identity with PrmA*, which was associated with co-metabolic degradation induced by toluene in Rhodococcus sp. RR1 (18). Other co-metabolic PrmA*-like proteins in this clade were induced by propane and 1-butanol (Q0SJK9.1, COMP13; WP_011593714.1, COMP16) (85). Although propane and 1-butanol are not specifically monitored in the BTS, toluene is measured in BTS effluent. However, whether any of these chemicals induce co-metabolic dioxane degradation by PrmA*-like enzymes in the bioreactors remains an open question. Because these proteins were detected to a lesser extent, it is possible that dioxane degradation by PrmA*-like enzymes is a less common mechanism in the support media bacterial communities.
Compared to Group V, other SDIMO groups were detected more infrequently and to a much lesser degree. Of these, Group II SDIMOs were detected at three orders of magnitude below the Group V SDIMOs. Some Group II SDIMOs claded with the CDDP TmoA, which was induced to degrade dioxane co-metabolically with toluene in Azoarcus sp. DD4 (QDF97112.1; CDDP3) (86). Whether dioxane co-metabolism induced by toluene and mediated by TmoA-like enzymes is present in the BTS has not been tested. Additionally, Group IV SDIMOs were detected, although only at a total abundance that was two orders of magnitude lower than that of the Group V SDIMOs. Although Group IV SDIMOs have not been shown to degrade dioxane to date, these proteins have been detected in dioxane-impacted groundwater surrounding industrial sites in Arizona (25) and Alaska (26).
Interestingly, Tier 1 Lowry proteins, those that we were most confident encoded SDIMOs, contained neither Group III nor Group VI SDIMOs. A study on another dioxane-impacted landfill detected sMMO, a Group III SDIMO associated with methane-induced co-metabolic dioxane degradation in OB3b (18), in most landfill monitoring wells by qPCR (95). In the Lowry Landfill, it is possible that bioreactor conditions caused a lack of sMMO-like proteins. For example, sMMO is often detected under copper-limited conditions and may be replaced by particulate MMO (pMMO) in methanotrophs when copper is not limited (96). The lack of Group VI SDIMOs, like PrmA, is also of note. Prm is a directly metabolic dioxane-degrading enzyme first identified in Mycobacterium dioxanotrophicus PH-06 (WP_087083743.1) (82). Compared to the Dxm of CB1190, Prm has shown a higher dioxane degradation rate, broader substrate range, and less susceptibility to inhibition (97).
Tier 2 SDIMOs showed broader phylogenetic diversity than Tier 1. Tier 2 uncovered potential SDIMOs that were closely related to CDDPs not seen in Tier 1, namely, Group I TomA3-like CDDPs, Group III sMMO-like CDDPs, and Group VI PrmA-like CDDPs. This illustrates other possible CDDPs in the Lowry Landfill system that could later be studied for dioxane degradation ability.
The detection of enzymes involved in intermediate degradation provides preliminary evidence for the complete degradation of dioxane to carbon dioxide by the Lowry bioreactor microorganisms. Of note, we detected enzymes associated with the transformation of dioxane intermediates into glyoxylate, a major intermediate in dioxane degradation by CB1190 (69). We also detected enzymes from two different pathways that may transform glyoxylate into intermediates that can enter the TCA cycle, revealing a potential mechanism for complete dioxane degradation to carbon dioxide. Interestingly, the glyoxylate carboligase pathway, previously reported to support the growth of CB1190 during dioxane degradation (69), was found in our system. Furthermore, detection of these and other sequences upregulated in CB1190 upon dioxane exposure (69) may indicate that dioxane degradation mechanisms in the Lowry bioreactor are similar to those of CB1190. However, further experiments are needed to both identify and quantify the expression of genes that are actively used during dioxane degradation by Lowry bioreactor microorganisms.
Several studies have attempted to correlate the presence of different SDIMOs with dioxane degradation ability of bacterial communities. The current study found a potential dominance of DxmA-like proteins in the Lowry Landfill bioreactor support media. This contrasted slightly with the findings of Goff and Hug (15), which predicted that DxmA was rare and largely restricted to terrestrial environments while engineered environments were expected to show higher abundances of Tmo (Group II), Tom (Group I), and Tbu (Group II). However, in dioxane-impacted groundwater in Alaska, DxmA-like proteins were detected throughout a dioxane plume but were most prevalent near the source of contamination. Certain Group I SDIMOs were also detected at the source (26). Another study enriched dioxane-degrading microbial communities from two uncontaminated soils. Both soils began with diverse communities and SDIMOs, which were then dominated by Mycobacterium and Group V and VI SDIMOs after dioxane addition (98). Also, a study attempting to enrich dioxane-degrading consortia found that Group V, PrmA*-like and Group I, TomA3-like proteins were most abundant in both uncontaminated and contaminated soils. Although there was no statistically significant difference in the abundances of these proteins before and after dioxane addition, Mycobacterium was among one of the taxa enriched after dioxane addition (28). Like the current study, another used shotgun sequencing to characterize the SDIMOs and microbial communities of chlorinated solvent-impacted sites. However, these sites were bioaugmented with SDC-9 (Shaw Environmental Inc., Lawrenceville, New Jersey), a dechlorinating bacterial consortium, and one site was confirmed to show dioxane contamination. Across those systems, the most abundant SDIMOs belonged to Groups III, I, and then II. Pseudonocardia, its associated DxmA, Mycobacterium and its associated PrmA, were not detected in any of those sites (27). Therefore, while certain studies have also found a prevalence of Group V SDIMOs in dioxane-impacted sites, others have shown dioxane-degrading microbial communities that may utilize other SDIMO groups. These discrepancies may be due to environmental conditions such as redox potential (28).
Conclusions
Support media from Lowry Landfill Bioreactor 1 showed stable and diverse microbial communities that contained potential dioxane degraders, namely, Pseudonocardia. Lowry proteins that we were most confident encoded SDIMOs mostly belonged to Group V, with DxmA-like proteins being the most prevalent. Contigs containing DxmA-like Lowry proteins showed evidence of a range of horizontal gene transfer mechanisms. These may indicate a selection pressure in the bioreactor microbial communities to maintain Dxm. Additionally, other SDIMO groups were detected, showing the potential phylogenetic diversity of this one enzyme family in the bioreactor. Future studies may isolate dioxane-degrading species or simplified consortia from the bioreactor, test the dioxane degradation ability and mobility of the DxmA proteins from this work, and elucidate the overall microbial ecology of the Lowry Landfill BTS.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1EPA. 2017. Technical fact sheet – 1,4-dioxane (EPA-505-F-17-011)
- 2EPA.2006. Treatment technologies for 1,4-dioxane: fundamentals and field applications.
- 3Godri Pollitt KJ, Kim J-H, Peccia J, Elimelech M, Zhang Y, Charkoftaki G, Hodges B, Zucker I, Huang H, Deziel NC, Murphy K, Ishii M, Johnson CH, Boissevain A, O’Keefe E, Anastas PT, Orlicky D, Thompson DC, Vasiliou V. 2019. 1,4-Dioxane as an emerging water contaminant: state of the science and evaluation of research needs. Sci Total Environ 690:853–866. doi:10.1016/j.scitotenv.2019.06.44331302550 · doi ↗ · pubmed ↗
- 4Mohr T, Di Guiseppi W, Hatton J, Anderson J. 2020. Environmental investigation and remediation: 1,4-dioxane and other solvent stabilizers. 2nd ed. Boca Raton. https://doi-org.aurarialibrary.idm.oclc.org/10.1201/9780429401428.
- 5Mc Elroy AC, Hyman MR, Knappe DRU. 2019. 1,4-dioxane in drinking water: emerging for 40 years and still unregulated. Current Opinion Environ Sci Health 7:117–125. doi:10.1016/j.coesh.2019.01.003 · doi ↗
- 6EPA.2018. Problem formulation of the risk evaluation for 1,4-dioxane.
- 7Doherty AC, Lee CS, Meng QY, Sakano Y, Noble AE, Grant KA, Esposito A, Gobler CJ, Venkatesan AK. 2023. Contribution of household and personal care products to 1,4-dioxane contamination of drinking water. Current Opinion Environ Sci Health 31:100414. doi:10.1016/j.coesh.2022.100414 · doi ↗
- 8Zenker MJ, Borden RC, Barlaz MA. 2003. Occurrence and treatment of 1,4-dioxane in aqueous environments. Environ Eng Sci 20:423–432. doi:10.1089/109287503768335913 · doi ↗
