Complete genome sequence of Streptomyces sp. CY-1 isolated from the rhizosphere of soybean (Glycine max (L.) Merr.)
Ying Guan, Xinlan Mei, Bingcheng Cong, Shimei Wang

TL;DR
This paper presents the complete genome sequence of a Streptomyces strain isolated from the rhizosphere of soybean plants.
Contribution
The complete genome sequence of Streptomyces sp. CY-1 is newly reported using PacBio and Illumina sequencing.
Findings
The genome is a linear chromosome of 11,477,595 bp in length.
The sequencing was performed using both PacBio and Illumina platforms.
Abstract
We report the complete genome sequence of Streptomyces sp. strain CY-1, which was isolated from the rhizosphere of soybean. The genome was sequenced using both PacBio and Illumina platforms. It comprises a linear chromosome of 11,477,595 bp in length.
Funding
- National Natural Science Foundation of China (http://dx.doi.org/10.13039/501100001809)
- China Postdoctoral Science Foundation (http://dx.doi.org/10.13039/501100002858)
Taxonomy
Topics: Plant Disease Resistance and Genetics · Genomics and Phylogenetic Studies · Legume Nitrogen Fixing Symbiosis
ANNOUNCEMENT
Streptomyces species are renowned for producing secondary metabolites with applications in agriculture and medicine (1, 2). Rhizosphere soil closely adhering to soybean roots was collected from four plants in Anhui, China (31.5511°N, 118.4889°E) as described by Guan et al. (3). Ten grams of soil was suspended in 90 mL of sterilized water and shaken at 28°C and 180 rpm for 30 min. Serial dilutions (10⁻⁴ to 10⁻⁶) were plated (100 µL each) onto Gause’s No. 1 agar medium (4) and incubated at 28°C for five days. After purification, strain CY-1 was selected for genome sequencing.
Spores of strain CY-1 grown for five days on Gause’s No. 1 agar were collected using 1 mL sterilized saline and inoculated into Luria-Bertani broth (5) for two days at 30°C. DNA was extracted using the PureLink Genomic DNA Kit (Thermo Fisher Scientific, Waltham, MA, USA) following the manufacturer’s instructions. Illumina libraries were prepared using the TruSeq Nano DNA Sample Prep Kit (Illumina, San Diego, CA, USA) following Illumina’s standard procedure. Paired-end libraries with insert sizes of ~400 bp were constructed and sequenced on the Illumina NovaSeq 6000 platform (2×150 bp) by Shanghai BIOZERON Co., Ltd. Sequencing generated 17,151,254 reads. Default software parameters were applied unless otherwise specified. Reads were trimmed using Trimmomatic v0.36 (6) with parameters “ILLUMINACLIP:adapters.fa:2:30:10 SLIDINGWINDOW:4:15 MINLEN:75”, yielding 14,367,126 clean reads.
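The trimming settings above can be illustrated with a simplified sketch. The Python below is not Trimmomatic's implementation; it only mimics the idea behind SLIDINGWINDOW:4:15 (cut the read where the mean quality of a 4-base window first drops below 15) and MINLEN:75 (discard reads shorter than 75 bases after trimming). The function names and the toy quality scores are hypothetical, adapter clipping (ILLUMINACLIP) is not modeled, and reads shorter than the window are simply left untrimmed here. The final lines compute the share of reads retained from the counts reported above.

```python
def sliding_window_keep(quals, window=4, min_avg=15):
    """Return how many leading bases to keep (simplified sliding-window cut).

    Scans from the 5' end and cuts at the start of the first window whose
    mean Phred quality falls below min_avg.
    """
    n = len(quals)
    for start in range(0, n - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            return start
    return n  # no window failed, keep the whole read


def passes_minlen(quals, window=4, min_avg=15, min_len=75):
    """True if the trimmed read is still at least min_len bases long."""
    return sliding_window_keep(quals, window, min_avg) >= min_len


# Toy reads (hypothetical quality scores, not data from the study)
good_read = [30] * 80 + [2] * 20   # quality collapses near the 3' end
poor_read = [30] * 50 + [2] * 50   # quality collapses at the midpoint

kept_good = sliding_window_keep(good_read)
kept_poor = sliding_window_keep(poor_read)

# Read counts reported in the text
raw_reads = 17_151_254
clean_reads = 14_367_126
retained_pct = 100 * clean_reads / raw_reads
```

In this sketch the first toy read survives the length filter and the second does not, which is the kind of read loss that separates the raw count from the clean count.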
The same DNA aliquots used for Illumina sequencing were also used for PacBio Sequel II sequencing. Genomic DNA was converted into SMRTbell libraries using the Express Template Prep Kit 2.0 (Pacific Biosciences, USA) according to the manufacturer’s protocol. Size selection was performed using BluePippin (Sage Science, USA) with the 0.75% DF Marker S1 High-Pass 6 kb–10 kb v3 run protocol and S1 marker. A size selection cutoff of 8,000 bp (BPstart value) was applied. A total of 194,923 raw PacBio reads were generated, comprising 1,536,746,277 bases, with an average read length of 7,884 bp and an N50 of 9,465 bp.
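The read statistics above follow from simple definitions, sketched here in Python. The average read length is total bases divided by read count, and N50 is the read length at which reads of that length or longer contain at least half of all bases. The function name and the toy length list are hypothetical; the N50 of the actual data set cannot be recomputed without the individual read lengths.

```python
def n50(lengths):
    """Length L such that reads >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Values reported in the text
total_reads = 194_923
total_bases = 1_536_746_277
mean_read_length = total_bases / total_reads  # rounds to the reported 7,884 bp

# Toy example (hypothetical lengths) to show how N50 behaves
toy_n50 = n50([2, 3, 4, 5, 6])
```

Because N50 weights reads by the bases they contribute, it normally sits above the mean for long-read data, as it does here (9,465 bp versus 7,884 bp).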
Genome assembly was performed using both PacBio and Illumina reads with Unicycler v0.4.8 (--min_fasta_length 500 -t 32 --mode normal) (7), followed by three rounds of polishing with Pilon v1.21 (8) using the Illumina reads. Annotation was performed with the NCBI Prokaryotic Genome Annotation Pipeline v6.9 (9, 10).
The genome consists of a single linear contig of 11,477,595 bp with a G+C content of 70.91% and 308× coverage. Genome completeness was 100% and contamination was 0.48%, as evaluated using CheckM2 v1.0.2 (11) and BUSCO v5.3.2 (12) with the bacteria_odb10 reference database. Terminal inverted repeats of 14,978 bp were detected using BLAST v2.2.26, and mapping of PacBio reads to the genome sequence with minimap2 v2.28-r1209 confirmed coverage across these regions (13). Annotation identified 9,646 genes, comprising 9,234 coding sequences, 322 pseudogenes, 69 tRNAs, 3 ncRNAs, and 18 rRNAs. The 16S rRNA gene sequence of CY-1 exhibited 99.93% identity to Streptomyces yatensis NBRC 101000 (AB249962) in the EzBioCloud 16S rRNA database (14).
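To make the notion of terminal inverted repeats concrete, the Python sketch below checks how far the left end of a linear sequence matches the reverse complement of its right end. This is an exact-match simplification for illustration only; the study used BLAST, which tolerates mismatches and gaps. The function names and the toy sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def exact_tir_length(seq):
    """Length of the longest exact terminal inverted repeat.

    The left arm must equal the reverse complement of the right arm,
    so this is the common prefix of seq and its reverse complement,
    capped at half the sequence length so the arms cannot overlap.
    """
    rc = reverse_complement(seq)
    limit = len(seq) // 2
    k = 0
    while k < limit and seq[k] == rc[k]:
        k += 1
    return k


def gc_percent(seq):
    """G+C content as a percentage of sequence length."""
    s = seq.upper()
    return 100 * (s.count("G") + s.count("C")) / len(s)


# Toy linear "chromosome": arm + unique middle + reverse complement of arm
arm = "AACCGT"
toy = arm + "GGGGG" + reverse_complement(arm)
toy_tir = exact_tir_length(toy)
```

On a real assembly, an exact comparison like this would stop at the first mismatch between the two arms, which is one reason an alignment tool is the more appropriate choice in practice.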
References
- 1. Alam K, Mazumder A, Sikdar S, Zhao Y-M, Hao J, Song C, Wang Y, Sarkar R, Islam S, Zhang Y, Li A. 2022. Streptomyces: the biofactory of secondary metabolites. Front Microbiol 13:968053. doi:10.3389/fmicb.2022.968053. PMID 36246257; PMCID PMC9558229.
- 2. Liu F, Wang N, Wang Y, Yu Z. 2024. The insecticidal activity of secondary metabolites produced by Streptomyces sp. SA61 against Trialeurodes vaporariorum (Hemiptera: Aleyrodidae). Microorganisms 12:2031. doi:10.3390/microorganisms12102031. PMID 39458340; PMCID PMC11509760.
- 3. Guan Y, Bak F, Hennessy RC, Horn Herms C, Elberg CL, Dresbøll DB, Winding A, Sapkota R, Nicolaisen MH. 2024. The potential of Pseudomonas fluorescens SBW25 to produce viscosin enhances wheat root colonization and shapes root-associated microbial communities in a plant genotype-dependent manner in soil systems. mSphere 9:e0029424. doi:10.1128/msphere.00294-24. PMID 38904362; PMCID PMC11288004.
- 4. Cao P, Liu C, Sun P, Fu X, Wang S, Wu F, Wang X. 2016. An endophytic Streptomyces sp. strain DHV3-2 from diseased root as a potential biocontrol agent against Verticillium dahliae and growth elicitor in tomato (Solanum lycopersicum). Antonie Van Leeuwenhoek 109:1573–1582. doi:10.1007/s10482-016-0758-6. PMID 27582275.
- 5. Sezonov G, Joseleau-Petit D, D’Ari R. 2007. Escherichia coli physiology in Luria-Bertani broth. J Bacteriol 189:8746–8749. doi:10.1128/JB.01368-07. PMID 17905994; PMCID PMC2168924.
- 6. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi:10.1093/bioinformatics/btu170. PMID 24695404; PMCID PMC4103590.
- 7. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi:10.1371/journal.pcbi.1005595. PMID 28594827; PMCID PMC5481147.
- 8. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. doi:10.1371/journal.pone.0112963. PMID 25409509; PMCID PMC4237348.
