High-quality draft genome sequences of seven Ralstonia spp. isolated from temperate forest soils
A. Li Han Chan, Mallory J. Choudoir, Achala Narayanan, Damayanti Rodriguez-Ramos, Grace Pold, Andrew F. Billings, Kristen M. DeAngelis

TL;DR
This paper presents seven high-quality genome sequences of Ralstonia bacteria from forest soils, providing insights into their genetic makeup.
Contribution
The study provides new high-quality draft genomes of Ralstonia species using hybrid and long-read assembly methods.
Findings
Seven Ralstonia genomes were sequenced with high completeness (minimum 92.6%).
The genomes have an average GC content of 63.45%.
Both hybrid and long-read assembly methods were used to generate the sequences.
Abstract
We report seven high-quality draft genomes of Ralstonia spp. isolated from the Harvard Forest Long-Term Warming Experimental plots: four de novo hybrid assemblies and three de novo long-read assemblies. The genomes have a minimum estimated completeness of 92.6% and an average GC content of 63.45%.
| | AB22-23 | AB28-3 | AB36-13C | GP103 | GP71 | GP101 | GP95 |
|---|---|---|---|---|---|---|---|
| Soil horizon | Mineral | Mineral | Mineral | Organic | Organic | Organic | Organic |
| Warming treatment | Control | Warm | Warm | Warm | Control | Warm | Control |
| Isolation media | 1% nutrient broth | 1% nutrient broth | 1% nutrient broth | VL55 plant | VL55 plant | VL55 plant | VL55 plant |
| Isolation atmosphere | 5% hydrogen, 5% | Aerobic | Aerobic-COY | Aerobic | Aerobic | Aerobic | Aerobic |
| DNA extraction method | Qiagen DNeasy | Qiagen DNeasy | Qiagen DNeasy | Qiagen DNeasy | Qiagen DNeasy | Qiagen DNeasy | Modified |
| Sequencing facility | SeqCenter, | SeqCenter, | SeqCenter, | SeqCenter, | In-house ONT | In-house ONT | In-house ONT |
| Sequencing technology | Illumina NextSeq | Illumina NextSeq | Illumina NextSeq | Illumina NextSeq | ONT | ONT | ONT |
| ONT Flow Cell | FLO-MIN106 | FLO-MIN106 | FLO-MIN106 | FLO-MIN106 | FLO-MIN106 | FLO-MIN106 | FLO-MIN114 |
| ONT Library Preparation Kit | SQK-LSK109, | SQK-LSK109, | SQK-LSK109, | SQK-LSK109, | SQK-LSK109, | SQK-LSK109, | SQK-LSK114 |
| Base caller | Guppy version 6.5.7 | Guppy version 6.5.7 | Guppy version 6.5.7 | Guppy version 6.5.7 | Guppy version 6.5.7 | Guppy version 6.5.7 | Dorado version 0.7.3 |
| BioSample | | | | | | | |
| SRA accession no. (ONT) | | | | | | | |
| SRA accession no. (Illumina) | | | | | N/A | N/A | N/A |
| WGS accession no. | | | | | | | |
| Assembly year | 2022 | 2022 | 2022 | 2022 | 2022 | 2021 | 2024 |
| Assembly type | | | | | | | |
| Reads Illumina (read pairs) | 3,625,687 | 2,957,812 | 3,160,759 | 3,519,480 | N/A | N/A | N/A |
| Number of raw ONT reads (bp) | 149,100,000 | 220,000,000 | 246,700,000 | 220,000,000 | 2,100,000,000 | 5,645,342,330 | 6,011,438,671 |
| Sequencing | 12,636 | 17,948 | 10,557 | 24,364 | 43,073 | 16,001 | 4,038 |
| Filtered reads ONT (bp) | 149,072,979 | 488,529,099 | 246,663,274 | 554,542,900 | 240,016,725 | 216,041,805 | 236,625,424 |
| Genome size (bp) | 5,298,867 | 5,483,600 | 5,513,759 | 5,515,304 | 5,432,770 | 5,513,656 | 5,346,046 |
| Coverage | 184.9 | 151.1 | 160.9 | 179.0 | 44.2 | 39.2 | 44.3 |
| Assembly | 1,386,635 | 3,507,490 | 3,537,649 | 3,537,649 | 3,537,633 | 3,537,632 | 3,533,349 |
| GC content (%) | 63.57 | 63.61 | 63.58 | 63.58 | 62.63 | 63.58 | 63.63 |
| No. of contigs | 7 | 6 | 5 | 6 | 4 | 5 | 6 |
| Contamination (%) | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0 |
| Completeness (%) | 92.6 | 99.94 | 99.94 | 99.94 | 99.06 | 98.92 | 99.94 |
| Most closely related genome | | | | | | | |
| 23S rRNA genes count | 21 | 21 | 21 | 21 | 20 | 21 | 21 |
| 16S rRNA genes count | 17 | 19 | 19 | 19 | 19 | 16 | 19 |
| 5S rRNA genes count | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| tRNA count | 64 | 78 | 78 | 78 | 65 | 86 | 78 |
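For the three long-read assemblies (GP71, GP101, GP95), the Coverage row is consistent with dividing the filtered ONT bases by the genome size. The sketch below reproduces that arithmetic with values copied from the table. The function and dictionary names are illustrative, and this is an observation about the reported numbers, not a statement of how the authors computed coverage. The hybrid assemblies are left out because their coverage also depends on Illumina data after trimming, which the table does not report.

```python
# Values copied from the table for the three long-read assemblies.
LONG_READ_ISOLATES = {
    "GP71": {"filtered_ont_bp": 240_016_725, "genome_size_bp": 5_432_770, "reported": 44.2},
    "GP101": {"filtered_ont_bp": 216_041_805, "genome_size_bp": 5_513_656, "reported": 39.2},
    "GP95": {"filtered_ont_bp": 236_625_424, "genome_size_bp": 5_346_046, "reported": 44.3},
}


def depth_of_coverage(total_bases: int, genome_size: int) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome length."""
    return total_bases / genome_size


if __name__ == "__main__":
    for name, row in LONG_READ_ISOLATES.items():
        depth = depth_of_coverage(row["filtered_ont_bp"], row["genome_size_bp"])
        print(f"{name}: {depth:.1f}x (table: {row['reported']}x)")
```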
- National Science Foundation, http://dx.doi.org/10.13039/501100008982
ANNOUNCEMENT
Ralstonia are free-living soil bacteria known for pathogenicity (1) and terrestrial carbon cycling (2). To better understand Ralstonia and how they respond to climate stress, we sequenced and annotated seven Ralstonia isolates from experimentally heated and control soils at Prospect Hill in the Harvard Forest Long-Term Warming Experiment in Petersham, MA, USA (42.54°N, 72.18°W) (3). Soils were collected in April and June 2014 with 1/2-inch tubular soil corers to a depth of 10 cm, with the organic and mineral horizons split by eye (4). The respective isolation media and atmosphere are described in Table 1.
For whole-genome sequencing, isolates were streaked from glycerol stocks onto 10% tryptic soy agar or Reasoner’s 2A agar (pH 6). Single colonies were picked and grown in the corresponding liquid medium on a shaking incubator at 30°C until stationary phase. Genomic DNA was extracted from the cell pellets using the methods listed in Table 1. An additional RNase A treatment was performed for GP95 according to reference 5. We validated the average DNA fragment size to be 30–50 kb on a 0.5% agarose gel with the Quick-Load 1 kb Extend DNA ladder (New England Biolabs, Ipswich, MA, USA).
For long-read sequencing using Oxford Nanopore Technologies (ONT; Oxford, UK), we followed the ligation sequencing library protocol compatible with the consumables listed in Table 1. No size selection or shearing was performed prior to library preparation. Six to seven isolates were multiplexed in a single sequencing run of 48–72 hours, with barcoded allocation turned on in the MinKNOW interface (Oxford Nanopore Technologies), except GP95, which was sequenced on a single flow cell. FAST5 or POD5 files were base called using the high-accuracy model with the base callers listed in Table 1. The reads were subsampled to a minimum of 40× target coverage and filtered for quality and length using “--min_length 1000 --min_mean_q 85” in Filtlong version 0.2.1 (6).
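To make the two read-filtering thresholds concrete, here is a minimal Python sketch of a length and mean-quality filter. It is not Filtlong's implementation. It assumes that mean quality is the average per-base accuracy in percent derived from Phred+33 scores, which may differ from the tool's own scoring. The function names and the 5.5 Mb genome size in the usage note are hypothetical.

```python
def mean_accuracy_percent(quality_string: str, phred_offset: int = 33) -> float:
    """Average per-base accuracy (0-100) implied by a FASTQ quality string.

    Assumption: each Phred score Q maps to an error probability 10**(-Q/10),
    and the read score is the mean of (1 - error) across bases.
    """
    accuracies = [1 - 10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string]
    return 100 * sum(accuracies) / len(accuracies)


def keep_read(sequence: str, quality_string: str,
              min_length: int = 1000, min_mean_q: float = 85.0) -> bool:
    """Return True if a read passes both the length and mean-quality thresholds."""
    if len(sequence) < min_length:
        return False
    return mean_accuracy_percent(quality_string) >= min_mean_q


def target_bases(genome_size_bp: int, coverage: int = 40) -> int:
    """Number of bases to retain for a given target depth of coverage."""
    return genome_size_bp * coverage
```

For example, under these assumptions `target_bases(5_500_000)` gives the number of bases corresponding to 40× depth for an assumed 5.5 Mb genome, and `keep_read` rejects any read shorter than 1,000 bases regardless of its quality.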
We used Flye version 2.9.1 (7) to generate the de novo assembly, Racon version 1.4.3 (8) and Minimap2 version 2.24 (9) to generate the consensus sequence, and Medaka version 1.7.2 (Oxford Nanopore Technologies) to generate the final polished genome draft sequence.
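The order of tools described above can be sketched as a list of command lines. This is a hedged illustration only: the file names, thread count, and the choice of Flye's `--nano-raw` read mode and minimap2's `map-ont` preset are assumptions, not parameters reported in the text, and the function only builds the commands without running them.

```python
from pathlib import Path


def long_read_pipeline(reads: str, workdir: str, threads: int = 8):
    """Build (argv, stdout_path) pairs for assembly, consensus, and polishing.

    stdout_path is the file that should capture the tool's standard output,
    or None when the tool writes into its own output directory.
    """
    work = Path(workdir)
    draft = work / "flye" / "assembly.fasta"   # Flye writes assembly.fasta in its out-dir
    overlaps = work / "overlaps.paf"           # read-to-draft mappings for Racon
    racon_out = work / "racon.fasta"           # Racon consensus
    t = str(threads)
    return [
        # 1. De novo assembly of the filtered ONT reads.
        (["flye", "--nano-raw", reads, "--out-dir", str(work / "flye"), "--threads", t], None),
        # 2. Map reads back to the draft so Racon can build a consensus.
        (["minimap2", "-x", "map-ont", "-t", t, str(draft), reads], overlaps),
        # 3. Consensus from reads, mappings, and the draft assembly.
        (["racon", "-t", t, reads, str(overlaps), str(draft)], racon_out),
        # 4. Final polish of the Racon consensus with Medaka.
        (["medaka_consensus", "-i", reads, "-d", str(racon_out),
          "-o", str(work / "medaka"), "-t", t], None),
    ]
```

Each pair could be executed with `subprocess.run(argv, stdout=open(path, "w"), check=True)` when a stdout path is given, or `subprocess.run(argv, check=True)` otherwise.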
For de novo hybrid assembled isolates, short read libraries were constructed using Illumina (San Diego, CA, USA) DNA Prep kit and IDT (Coralville, IA, USA) 10 bp UDI indices and sequenced on NextSeq 2000 (Illumina) to produce 2 × 151 bp reads. Demultiplexing, quality control, and adapter trimming were performed with bcl-convert version 3.9.3 (Illumina). De novo hybrid assemblies were generated using Unicycler version 0.5.0 (14). Default parameters were used except where otherwise noted.
Genomes were identified as Ralstonia with average nucleotide identity by NCBI (15). The genome sizes for these isolates range from 5,298,867 to 5,515,304 bp, with GC content ranging from 62.63% to 63.63%. Genome contamination and completeness were estimated with CheckM version 1.0.18 (16). Genomes were not circularized and were annotated with RASTtk version 1.073 (17).
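The summary figures quoted in the text can be checked against the per-isolate rows of the table. The short sketch below recomputes the genome size range, the GC range and mean, and the minimum completeness. The values are copied from the table in column order (AB22-23 through GP95), and the variable and function names are illustrative.

```python
from statistics import mean

# Per-isolate values from the table, in column order AB22-23 ... GP95.
GENOME_SIZE_BP = [5_298_867, 5_483_600, 5_513_759, 5_515_304, 5_432_770, 5_513_656, 5_346_046]
GC_PERCENT = [63.57, 63.61, 63.58, 63.58, 62.63, 63.58, 63.63]
COMPLETENESS_PERCENT = [92.6, 99.94, 99.94, 99.94, 99.06, 98.92, 99.94]


def summarize() -> dict:
    """Recompute the summary statistics reported in the abstract and text."""
    return {
        "genome_size_range_bp": (min(GENOME_SIZE_BP), max(GENOME_SIZE_BP)),
        "gc_range_percent": (min(GC_PERCENT), max(GC_PERCENT)),
        "gc_mean_percent": round(mean(GC_PERCENT), 2),
        "min_completeness_percent": min(COMPLETENESS_PERCENT),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```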
REFERENCES
- 1. Ryan MP, Adley CC. 2014. Ralstonia spp.: emerging global opportunistic pathogens. Eur J Clin Microbiol Infect Dis 33:291–304. doi:10.1007/s10096-013-1975-9. PMID: 24057141.
- 2. DeAngelis KM, Pold G, Topçuoğlu BD, van Diepen LTA, Varney RM, Blanchard JL, Melillo J, Frey SD. 2015. Long-term forest soil warming alters microbial communities in temperate forest soils. Front Microbiol 6:104. doi:10.3389/fmicb.2015.00104. PMID: 25762989. PMCID: PMC4327730.
- 3. Peterjohn WT, Melillo JM, Steudler PA, Newkirk KM, Bowles FP, Aber JD. 1994. Responses of trace gas fluxes and N availability to experimentally elevated soil temperatures. Ecol Appl 4:617–625. doi:10.2307/1941962.
- 4. Pold G, Billings AF, Blanchard JL, Burkhardt DB, Frey SD, Melillo JM, Schnabel J, van Diepen LTA, DeAngelis KM. 2016. Long-term warming alters carbohydrate degradation potential in temperate forest soils. Appl Environ Microbiol 82:6518–6530. doi:10.1128/AEM.02012-16. PMID: 27590813. PMCID: PMC5086546.
- 5. Yoshinaga Y, Dalin E. 2016. RNase A cleanup of DNA samples. Joint Genome Institute, Department of Energy.
- 6. Wick R. 2021. Filtlong: quality filtering tool for long reads. GitHub. Available from: https://github.com/rrwick/Filtlong. Retrieved 11 Mar 2025.
- 7. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. 2019. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 37:540–546. doi:10.1038/s41587-019-0072-8. PMID: 30936562.
- 8. Vaser R, Sović I, Nagarajan N, Šikić M. 2017. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res 27:737–746. doi:10.1101/gr.214270.116. PMID: 28100585. PMCID: PMC5411768.
