Metagenome-assembled genomes from microbial communities producing lactic acid from dairy residues
Faith Koester, Kevin S. Myers, Timothy J. Donohue, Daniel R. Noguera

TL;DR
This paper explores microbial communities in bioreactors that convert dairy waste into lactic acid, identifying 42 unique genomes.
Contribution
The study provides new metagenome-assembled genomes from anaerobic bioreactors processing dairy residues.
Findings
Forty-two unique metagenome-assembled genomes were identified from four anaerobic bioreactors.
The genomes represent distinct microbial taxa involved in fermenting dairy residues into lactic acid.
Abstract
To advance the knowledge of microbial communities capable of fermenting agro-industrial residues into value-added products, we report metagenomes of microbial communities from four anaerobic bioreactors fed a mixture of ultra-filtered milk permeate and cottage cheese acid whey. This analysis produced 42 unique metagenome-assembled genomes (MAGs) that represent distinct taxa.
| MAG ID | Phylum | GTDB classification | NCBI classification | Reference genome | ANI | Completeness (%) | Contamination (%) | MAG size (bp) | Contigs | N50 (bp) | %GC | NCBI accession number | NCBI SRA accession number |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PSEUD1 | Pseudomonadota | | | GCF_001043025.1 | 97.28 | 100 | 0.63 | 5,950,624 | 1 | 5,950,624 | 58.07 | | |
| BIF1 | Actinomycetota | | | GCF_000738005.1 | 97.36 | 99.08 | 2.12 | 2,454,116 | 1 | 2,454,116 | 57.86 | | |
| LENTI1 | Bacillota | | | GCF_001435315.1 | 97.43 | 98.94 | 0 | 2,574,352 | 1 | 2,574,352 | 43.68 | | |
| LACTICAS1 | Bacillota | | | GCF_000829035.1 | 98.43 | 98.64 | 2.47 | 2,927,657 | 29 | 303,583 | 46.58 | | |
| LEUC1 | Bacillota | | | GCF_000014445.1 | 98.78 | 91.8 | 0 | 1,925,562 | 8 | 1,350,794 | 37.81 | | |
| CLOS1 | Bacillota A | | | N/A | N/A | 94.71 | 7.62 | 2,653,518 | 18 | 408,933 | 31.39 | | |
| CLOS2 | Bacillota A | | | GCF_014050525.1 | 99.37 | 99.59 | 0 | 2,617,478 | 1 | 2,617,478 | 30.11 | | |
| THERMO1 | Bacillota A | | | GCF_000145615.1 | 97.17 | 99.4 | 4.11 | 3,240,700 | 12 | 2,846,837 | 34.01 | | |
| SUTT1 | Pseudomonadota | | | N/A | N/A | 96.89 | 0 | 2,299,283 | 13 | 427,523 | 63.29 | | |
| ATO1 | Actinomycetota | | | GCA_900314535.1 | 96.94 | 99.19 | 0.81 | 2,622,559 | 2 | 2,593,049 | 60.25 | | |
| LACTIPLAN1 | Bacillota | | | GCF_014131735.1 | 96.07 | 99.38 | 2.78 | 3,078,116 | 2 | 2,813,036 | 45.26 | | |
| PSEUD2 | Pseudomonadota | | | GCF_900105835.1 | 98.55 | 86.74 | 0.27 | 4,465,640 | 40 | 177,538 | 59.39 | | |
| LACTO1 | Bacillota | | | GCF_001433875.1 | 97.69 | 99.03 | 0 | 2,184,090 | 1 | 2,184,090 | 49.14 | | |
| LACTO2 | Bacillota | | | GCF_000160855.1 | 97.68 | 99.03 | 0.32 | 2,234,393 | 1 | 2,234,393 | 36.92 | | |
| LIMO1 | Bacillota | | | GCF_013394085.1 | 98.91 | 96.86 | 0 | 2,025,386 | 14 | 154,414 | 51.82 | | |
| SUCC1 | Bacillota C | | | N/A | N/A | 98.58 | 0 | 2,160,246 | 1 | 2,160,246 | 55.98 | | |
| ACID1 | Bacillota C | | | GCF_900106585.1 | 96.53 | 99.4 | 0 | 2,286,539 | 1 | 2,286,539 | 54.95 | | |
| CAPRO1 | Bacillota A | | | GCA_012516295.1 | 99.7 | 97.99 | 0.27 | 2,530,718 | 1 | 2,530,718 | 48.71 | | |
| CAPRO2 | Bacillota A | | | GCA_024498295.1 | 99.27 | 97.32 | 0 | 2,007,289 | 1 | 2,007,289 | 49.47 | | |
| CAPRO3 | Bacillota A | | | N/A | N/A | 97.99 | 0 | 2,650,324 | 1 | 2,650,324 | 44.45 | | |
| CAPRO4 | Bacillota A | | | N/A | N/A | 97.99 | 0.67 | 2,802,872 | 6 | 1,505,020 | 45.14 | | |
| LACTOCO1 | Bacillota | | | GCF_900099625.1 | 98.52 | 100 | 0.28 | 2,595,605 | 1 | 2,595,605 | 35.1 | | |
| PSEUD3 | Pseudomonadota | | | GCF_001042985.1 | 98.05 | 94.32 | 0.11 | 4,386,151 | 7 | 913,462 | 58.66 | | |
| LACTOCO2 | Bacillota | | | N/A | N/A | 98.11 | 0.5 | 2,242,213 | 1 | 2,242,213 | 41.54 | | |
| LACTOCO3 | Bacillota | | | GCF_001591765.1 | 98.66 | 98.11 | 0.72 | 2,239,778 | 1 | 2,239,778 | 39.97 | | |
| VIRG1 | Bacillota | | | GCF_900162615.1 | 99.58 | 100 | 1.67 | 4,423,115 | 10 | 1,262,884 | 36.16 | | |
| ENT1 | Bacillota | | | GCF_000407545.1 | 97.95 | 98.85 | 3.4 | 3,781,005 | 5 | 2,660,674 | 42.13 | | |
| ENT2 | Bacillota | | | GCF_001544255.1 | 98.62 | 99.63 | 0 | 2,753,251 | 1 | 2,753,251 | 38.16 | | |
| DIAL1 | Bacillota C | | | GCA_900765565.1 | 96.99 | 99.59 | 0 | 2,944,120 | 1 | 2,944,120 | 51.49 | | |
| MEGA1 | Bacillota C | | | GCF_000417505.1 | 98.54 | 100 | 0 | 2,329,985 | 1 | 2,329,985 | 54.02 | | |
| SELE1 | Bacillota C | | | GCA_934202005.1 | 95.96 | 100 | 0 | 2,276,171 | 1 | 2,276,171 | 60.17 | | |
| MITSU1 | Bacillota C | | | GCA_900552565.1 | 95.27 | 99.07 | 0.14 | 2,353,814 | 1 | 2,353,814 | 53.79 | | |
| XYLA1 | Bacillota A | | | GCF_004138105.1 | 98.99 | 97.58 | 1.21 | 2,650,085 | 14 | 339,090 | 36.41 | | |
| SPOR1 | Bacillota A | | | N/A | N/A | 99.3 | 0.7 | 2,267,806 | 1 | 2,267,806 | 31.32 | | |
| PSEUD4 | Pseudomonadota | | | GCF_014268455.2 | 99.27 | 99.32 | 1.21 | 6,278,246 | 1 | 6,278,246 | 60.55 | | |
| BACI1 | Bacillota | | | GCF_002250115.1 | 97.89 | 98.67 | 2 | 4,672,023 | 1 | 4,672,023 | 44.46 | | |
| ANA1 | Bacillota A | | | N/A | N/A | 94.62 | 1.57 | 3,467,848 | 49 | 146,365 | 37.8 | | |
| HAF1 | Pseudomonadota | | | GCF_001655005.1 | 99.06 | 99.86 | 0.45 | 4,969,061 | 1 | 4,969,061 | 48.1 | | |
| SERR1 | Pseudomonadota | | | GCF_000422085.1 | 98.86 | 100 | 0.45 | 5,585,869 | 1 | 5,585,869 | 55.23 | | |
| RAHN1 | Pseudomonadota | | | GCF_003263515.1 | 98.97 | 99.84 | 0.08 | 5,186,034 | 4 | 3,490,194 | 53 | | |
| CITRO1 | Pseudomonadota | | | GCF_002075345.1 | 98.53 | 98.21 | 0.56 | 5,116,397 | 30 | 324,668 | 52.12 | | |
| PSEUDOCLA1 | Actinomycetota | | | GCF_008831125.1 | 99.16 | 98.07 | 0 | 2,066,244 | 5 | 507,308 | 71.16 | | |
Taxonomy
Topics: Probiotics and Fermented Foods · Genomics and Phylogenetic Studies · Enzyme Production and Characterization
ANNOUNCEMENT
Microbial communities have the potential to increase the value of agro-industrial residues by fermentation. The metagenomes reported here originate from two sets of anaerobic bioreactors and their inoculant cultures. In each experiment, two chemostats were operated anaerobically at 50°C and pH 5.5, with a 6-day solids retention time and a 6-, 3-, or 1-day hydraulic retention time. The chemostats were supplied with a 1:1 mixture of ultra-filtered milk permeate and cottage cheese acid whey and inoculated with a culture derived from an acid-phase digester at a wastewater treatment plant (Madison, WI, USA) that was first enriched in a similarly operated bioreactor for 26 days at 50°C.
For each experiment, DNA was extracted weekly from the microbes in each bioreactor with the Qiagen Genomic-tip 20/G (Qiagen) following the manufacturer’s protocol for “Preparation of Gram‐negative and some Gram‐positive Bacterial Samples,” with lysostaphin (4,000 U mL^−1^, Sigma‐Aldrich) added. A total of 37 samples containing 3,000–10,000 ng DNA were sequenced on a Pacific Biosciences Sequel II platform by the Joint Genome Institute (JGI) (Berkeley, CA, USA).
Processing and sequencing of DNA included shearing genomic DNA to either 3 kb or 6–10 kb, depending on DNA quality, and performing ligations using the SMRTbell Express template prep kit (Pacific Biosciences). Size selection was performed with BluePippin (Sage Science). Finally, the libraries were sequenced on Sequel II (Pacific Biosciences, Inc. [PacBio], Menlo Park, CA, USA) using Sequel Polymerase Binding Kit and Revio chemistry. Across all 37 samples, the average number of reads per sample was 513,455 (range, 341,988 to 828,421). The average read length was 9,120 bp (range, 8,037 to 10,220 bp), and the average read N50 was 9,367 bp (range, 8,033 to 10,531 bp).
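The N50 values reported for reads here, and for MAGs in Table 1, follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total bases. The following is a minimal sketch of that calculation, not the code used by JGI or the authors; the function name `n50` and the toy read lengths are assumptions made for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of
        # all bases is the N50.
        if running >= half_total:
            return length


# Hypothetical read lengths (bp), chosen only to make the arithmetic easy.
toy_reads = [2_000, 3_000, 4_000, 5_000, 6_000]
print(n50(toy_reads))  # 6,000 + 5,000 = 11,000 >= 10,000, so N50 is 5000

# A single-contig MAG has an N50 equal to its size, as for PSEUD1 in Table 1.
print(n50([5_950_624]))
```

This also explains why every single-contig MAG in Table 1 lists the same value for MAG size and N50.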
Metagenomic DNA quality checking, assembly, binning, and taxonomic classification were performed as follows, with default parameters except where otherwise noted. BBTools (v39.01) (1) was used to remove reads containing inverted repeats. Reads were then assembled with metaMDBG (v0.3) (2) (“metaMDBG asm assembly <fastq.gz> -t 4”). The assembly was polished with pbmm2 (v1.13.1) (3) and Racon (v1.5.0) (4) for consensus, followed by alignment with minimap2 (v2.26-r1175) (5). Coverage was calculated with “pileup.sh” from BBTools (1), and binning was performed with MetaBAT (v2.15) (6). Quality was assessed with CheckM (v1.2.2) (7), and MAGs were dereplicated using dRep (v3.4.5) (-N50W 2) (8). GTDB-Tk (v2.3.2, database release 214) (9) was used to assign taxonomy to representative MAGs (“gtdbtk identify”) based on concatenated bacterial marker genes (Bac120). For NCBI submission, some taxonomic changes were required, as noted in Table 1. Bakta (v1.9.1) (10) was used for gene annotation of all MAGs.
This announcement reports 42 annotated, dereplicated MAGs that represent distinct taxa and are classified as high or medium quality according to Bowers et al. (11) (Table 1). These data advance our metagenome-based knowledge of microbiomes that bioconvert agro-industrial residues and wastes.
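As an illustration of the quality tiers, the sketch below applies the completeness and contamination cutoffs commonly quoted from the Bowers et al. standard (high quality: more than 90% complete with less than 5% contamination; medium quality: at least 50% complete with less than 10% contamination) to a few rows of Table 1. It is a simplified assumption, not the authors' procedure: the full high-quality definition also requires rRNA and tRNA genes, which Table 1 does not report, and the function name `quality_tier` is hypothetical.

```python
def quality_tier(completeness, contamination):
    """Tier a MAG using completeness/contamination cutoffs only (both in %).

    Assumption: the rRNA/tRNA requirements of the full high-quality
    definition are not checked because Table 1 does not list them.
    """
    if completeness > 90 and contamination < 5:
        return "high"
    if completeness >= 50 and contamination < 10:
        return "medium"
    return "below medium"


# (completeness %, contamination %) copied from Table 1.
table1_subset = {
    "PSEUD1": (100, 0.63),
    "LEUC1": (91.8, 0),
    "CLOS1": (94.71, 7.62),   # contamination above 5%
    "PSEUD2": (86.74, 0.27),  # completeness below 90%
}

for mag_id, (comp, contam) in table1_subset.items():
    print(mag_id, quality_tier(comp, contam))
```

Under these two cutoffs alone, CLOS1 and PSEUD2 fall in the medium tier while PSEUD1 and LEUC1 meet the high-quality thresholds.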
REFERENCES
- 1. Bushnell B. 2014. BBMap: a fast, accurate, splice-aware aligner. Available from: https://sourceforge.net/projects/bbmap. Retrieved 29 Sep 2024.
- 2. Benoit G, Raguideau S, James R, Phillippy AM, Chikhi R, Quince C. 2024. High-quality metagenome assembly from long accurate reads with metaMDBG. Nat Biotechnol 42:1378–1383. doi:10.1038/s41587-023-01983-6. PMID: 38168989. PMCID: PMC11392814.
- 3. PacBio. pbmm2: a minimap2 frontend for PacBio native data formats. Available from: https://github.com/PacificBiosciences/pbmm2. Accessed 29 September 2024.
- 4. Vaser R, Sović I, Nagarajan N, Šikić M. 2017. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res 27:737–746. doi:10.1101/gr.214270.116. PMID: 28100585. PMCID: PMC5411768.
- 5. Li H. 2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094–3100. doi:10.1093/bioinformatics/bty191. PMID: 29750242. PMCID: PMC6137996.
- 6. Kang DD, Li F, Kirton E, Thomas A, Egan R, An H, Wang Z. 2019. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7:e7359. doi:10.7717/peerj.7359. PMID: 31388474. PMCID: PMC6662567.
- 7. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25:1043–1055. doi:10.1101/gr.186072.114. PMID: 25977477. PMCID: PMC4484387.
- 8. Olm MR, Brown CT, Brooks B, Banfield JF. 2017. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J 11:2864–2868. doi:10.1038/ismej.2017.126. PMID: 28742071. PMCID: PMC5702732.
