Draft genome sequence of Streptomyces sp. strain M41 isolated from soil in a conserved region of Sipadan Island, Sabah, Malaysia
Mardani Abdul Halim, Nur Ariffah Waly, Gerald Jetony, Ken Kartina Khamis, Colin Robinson, Nurul Akmar Hussin, Clemente Michael Wong Vui Ling, Sazmal Effendi Arshad, Zarina Amin

TL;DR
This paper reports the draft genome sequence of a Streptomyces strain from Sipadan Island, Malaysia, highlighting its potential for producing secondary metabolites.
Contribution
The study provides a new draft genome from a conserved region, offering insights into its secondary metabolite biosynthesis potential.
Findings
The draft genome has 70 scaffolds and a total length of 8,039,509 bp.
It has a GC content of 70.9% and 7,139 putative genes.
Genes related to secondary metabolite biosynthesis were identified.
Abstract
We present a draft genome of Streptomyces sp. isolated from soil in a conserved region of Sipadan Island, Sabah, Malaysia, and sequenced using the Illumina NovaSeq 6000. The draft genome consists of 70 scaffolds with a total length of 8,039,509 bp and a GC content of 70.9%. It contains 7,139 putative genes, including genes predicted to encode secondary metabolite biosynthesis.
| Type | From | To | Most similar known cluster | Similarity (%) |
|---|---|---|---|---|
| | 103,345 | 147,724 | NRP | 66 |
| | 477,699 | 504,565 | Terpene | 92 |
| Terpene, butyrolactone | 84,658 | 106,817 | Other | 100 |
| | 1 | 37,426 | NRP | 100 |
| | 85,747 | 147,159 | NRP | 83 |
| RiPP-like, lanthipeptide-class-iii | 143,112 | 171,117 | RiPP: lanthipeptide | 100 |
| | 14,811 | 25,305 | Other | 100 |
| | 115,793 | 127,565 | Other | 83 |
| | 114,686 | 140,092 | Terpene | 63 |
| NRPS, T2PKS, other, oligosaccharide | 1 | 95,345 | Polyketide: type II+saccharide: hybrid/tailoring | 100 |
| | 19,595 | 40,518 | Terpene | 100 |
| | 13,619 | 24,023 | Other | 100 |
| | 3,799 | 44,986 | Other | 100 |
| Melanin, terpene | 51,094 | 72,798 | Other | 57 |
| | 2,935 | 63,190 | Polyketide | 75 |
| | 2,140 | 23,159 | Other | 100 |
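The table rows can be summarized programmatically. The following is a minimal sketch, not the authors' analysis: it hard-codes the From, To, cluster class, and similarity values listed above, and it assumes the coordinates are 1-based and inclusive, so that a region's length is `end - start + 1`. The variable and function names are illustrative only.

```python
from collections import Counter

# (start, end, most similar known cluster class, similarity %) copied from the table.
# Assumption: coordinates are 1-based and inclusive.
REGIONS = [
    (103345, 147724, "NRP", 66),
    (477699, 504565, "Terpene", 92),
    (84658, 106817, "Other", 100),
    (1, 37426, "NRP", 100),
    (85747, 147159, "NRP", 83),
    (143112, 171117, "RiPP: lanthipeptide", 100),
    (14811, 25305, "Other", 100),
    (115793, 127565, "Other", 83),
    (114686, 140092, "Terpene", 63),
    (1, 95345, "Polyketide: type II+saccharide: hybrid/tailoring", 100),
    (19595, 40518, "Terpene", 100),
    (13619, 24023, "Other", 100),
    (3799, 44986, "Other", 100),
    (51094, 72798, "Other", 57),
    (2935, 63190, "Polyketide", 75),
    (2140, 23159, "Other", 100),
]

def region_length(start, end):
    """Length in bp of a region with inclusive 1-based coordinates."""
    return end - start + 1

lengths = [region_length(start, end) for start, end, _, _ in REGIONS]
class_counts = Counter(cluster_class for _, _, cluster_class, _ in REGIONS)
full_matches = sum(1 for *_, similarity in REGIONS if similarity == 100)

print("regions:", len(REGIONS))
print("longest region (bp):", max(lengths))
print("regions with 100% similarity:", full_matches)
print("regions per cluster class:", dict(class_counts))
```

A similarity of 100% here refers to the antiSMASH comparison against the most similar known cluster, so the count is a quick way to see how many regions resemble already characterized clusters.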
- —SABAH BIODIVERSITY CENTRE MALAYSIA
Taxonomy
Topics: Genomics and Phylogenetic Studies · Microbial Natural Products and Biosynthesis · Bacteriophages and microbial interactions
ANNOUNCEMENT
Streptomyces species are well known for their antimicrobial production and the synthesis of secondary metabolites due to specific environmental adaptation mechanisms (1, 2). Streptomyces sp. strain M41 was isolated from Sipadan Island, located in the Semporna District of Sabah, Malaysia, at latitude 4°7'0.33" N and longitude 118°37'40.972" E, using Actinomycete Isolation Agar (HiMedia, India). Briefly, 1 g of soil sample was weighed into a tube, and 9 mL of sterile distilled water was added to create a 10-fold dilution. The dilution was spread on Actinomycete Isolation Agar without filtration and incubated at 25°C for 3 days under aerobic conditions. A single colony was picked and grown in actinomycete broth at 25°C for 3 days under aerobic conditions with continuous shaking prior to genomic DNA extraction. Genomic DNA was extracted using the Qiagen Genomic DNA Buffer Set Kit (Qiagen, Germany) following the manufacturer’s protocol. Identification was performed using the universal 16S rRNA primers 27F (5′-AGAGTTTGATCMTGGCTCAG-3′) and 1492R (5′-TACGGYTACCTTGTTACGACTT-3′), and the amplicon was sequenced using the Sanger method. The output FASTA file was analyzed using BLAST (3) version 2.13.0 (https://blast.ncbi.nlm.nih.gov/Blast.cgi) against the core nucleotide database (core nt). Our analysis showed that Streptomyces sp. strain M41 was closely related to Streptomyces sp. strain G2R-M4-3-4 (GenBank accession number PQ119703.1), with 99.87% identity.
For whole-genome sequencing, genomic DNA was enzymatically sheared into fragments of approximately 300 bp using the Illumina DNA Prep kit (Illumina, USA), followed by library preparation according to the manufacturer’s protocol. The library was then sequenced on the Illumina NovaSeq 6000 in a 150 bp paired-end run with approximately 100× depth of coverage, generating approximately 13.2 million reads.
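The two 16S rRNA primers each contain one degenerate IUPAC base (M in 27F, Y in 1492R), so each primer stands for a small pool of concrete oligonucleotides. The sketch below expands a degenerate primer into the sequences it encodes using the standard IUPAC ambiguity table. It is an illustration only and not part of the authors' workflow; the function name `expand_primer` is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as given in the text (5' to 3').
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_1492R = "TACGGYTACCTTGTTACGACTT"

for name, seq in (("27F", PRIMER_27F), ("1492R", PRIMER_1492R)):
    variants = expand_primer(seq)
    print(name, len(variants), variants)
```

Because each primer has a single two-fold degenerate position, each expands to exactly two sequences.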
Unless otherwise noted, default parameters were used for all downstream bioinformatics analyses. The raw reads were adapter-trimmed using Trimmomatic v0.38 (4). The cleaned reads were assembled and scaffolded using SPAdes v3.15.3 (5). Subsequently, the assembled genome was annotated using the NCBI Prokaryotic Genome Annotation Pipeline v6.8 (6), with the option “minimum contig size (--mincontiglen)” set to 500 and an E-value cutoff of 1e-06. ABRicate v1.0.1 (7) was used to screen for antibiotic resistance genes. Genome-wide identification of secondary metabolite biosynthesis gene clusters was performed using antiSMASH v6.1.1 (8). The draft genome consisted of 70 scaffolds with a total size of 8,039,509 bp, an N50 of 183,107 bp, and a GC content of 70.9%. From the annotation, 6,928 coding sequences (CDSs) were identified as protein-coding genes. ABRicate analysis predicted two genes that might function in synthesizing antibiotics, namely oleandomycin and erythromycin. Genome-wide screening predicted 16 locations of possible secondary metabolite biosynthesis gene clusters, namely nonribosomal peptide synthetase (NRPS), terpene, RiPP-like lanthipeptide, melanin, siderophore, ectoine, T3PKS, and T2PKS clusters. The details of this analysis are summarized in Table 1.
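The assembly statistics reported above (scaffold count, total size, N50, GC content) follow from simple definitions. The sketch below shows one common way to compute them from FASTA text. It is a generic illustration run on a toy input, not the tool the authors used, and the function names are hypothetical. N50 is taken here as the length of the scaffold at which the cumulative length, counted from the longest scaffold down, first reaches half of the total assembly size.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of sequences (headers dropped)."""
    seqs, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if current:
                seqs.append("".join(current))
                current = []
        elif line:
            current.append(line.upper())
    if current:
        seqs.append("".join(current))
    return seqs

def n50(lengths):
    """Length of the scaffold at which the running total reaches half the assembly."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0  # empty input

def gc_percent(seqs):
    """Percentage of G and C bases across all sequences."""
    total = sum(len(s) for s in seqs)
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    return 100.0 * gc / total if total else 0.0

# Toy assembly for illustration only; these are not data from strain M41.
TOY_FASTA = """>s1
GGCCGGCCGG
>s2
ATGCGCGC
>s3
GGCCAT
>s4
ATGC
>s5
GC
"""

scaffolds = read_fasta(TOY_FASTA)
scaffold_lengths = [len(s) for s in scaffolds]
print("scaffolds:", len(scaffolds))
print("total size (bp):", sum(scaffold_lengths))
print("N50 (bp):", n50(scaffold_lengths))
print("GC (%):", round(gc_percent(scaffolds), 1))
```

To apply this to a real assembly, the scaffold FASTA file would be read into a string and passed to `read_fasta` in place of the toy input.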
REFERENCES
- 1. Kim JH, Lee N, Hwang S, Kim W, Lee Y, Cho S, Palsson BO, Cho BK. 2021. Discovery of novel secondary metabolites encoded in actinomycete genomes through coculture. J Ind Microbiol Biotechnol 48:kuaa001. doi:10.1093/jimb/kuaa001. PMID 33825906. PMC9113425.
- 2. Selim MSM, Abdelhamid SA, Mohamed SS. 2021. Secondary metabolites and biodiversity of actinomycetes. J Genet Eng Biotechnol 19:72. doi:10.1186/s43141-021-00156-9. PMID 33982192. PMC8116480.
- 3. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL. 2008. NCBI BLAST: a better web interface. Nucleic Acids Res 36:W5–9. doi:10.1093/nar/gkn201. PMID 18440982. PMC2447716.
- 4. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi:10.1093/bioinformatics/btu170. PMID 24695404. PMC4103590.
- 5. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi:10.1089/cmb.2012.0021. PMID 22506599. PMC3342519.
- 6. Li W, O’Neill KR, Haft DH, DiCuccio M, Chetvernin V, Badretdin A, Coulouris G, Chitsaz F, Derbyshire MK, Durkin AS, Gonzales NR, Gwadz M, Lanczycki CJ, Song JS, Thanki N, Wang J, Yamashita RA, Yang M, Zheng C, Marchler-Bauer A, Thibaud-Nissen F. 2021. RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation. Nucleic Acids Res 49:D1020–D1028. doi:10.1093/nar/gkaa1105. PMID 33270901. PMC7779008.
- 7. Seemann T. 2016. ABRicate: Mass Screening of Contigs for Antibiotic Resistance Genes. Available from: https://github.com/tseemann/abricate
- 8. Blin K, Shaw S, Kloosterman AM, Charlop-Powers Z, van Wezel GP, Medema MH, Weber T. 2021. antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic Acids Res 49:W29–W35. doi:10.1093/nar/gkab335. PMID 33978755. PMC8262755.
