Complete genome sequences of the Paenibacillus kyungheensis KACC 18744T, Sphingomonas naphthae KACC 18716T, and Novosphingobium humi KACC 19094T

Hyorim Choi; Seunghwan Kim; Miyoung Won; Yunhee Choi; Yonghoon Lee; Yiseul Kim; Jun Heo

PMC · DOI:10.1128/mra.00753-24 · January 27, 2025

Open Access

TL;DR

This paper presents the complete genome sequences of three bacterial species found in Korea to study their genomic diversity.

Contribution

The novelty lies in providing complete genome sequences of three Korean bacterial type strains for genomic diversity analysis.

Findings

01. The whole genome sequence of Paenibacillus kyungheensis KACC 18744T was determined.

02. Genome sequences of Sphingomonas naphthae KACC 18716T and Novosphingobium humi KACC 19094T were reported.

03. The study contributes to understanding the genomic diversity of Korean bacterial type strains.

Abstract

We report the whole genome sequences of Paenibacillus kyungheensis KACC 18744T, Sphingomonas naphthae KACC 18716T, and Novosphingobium humi KACC 19094T, to investigate the genomic diversity of bacterial type strains distributed in Korea.

Linked entities

Species (3)

Novosphingobium humi (species) · Paenibacillus kyungheensis (species) · Sphingomonas naphthae (species)

Chemicals (1)

Reasoner's 2A medium

Tables (1)

TABLE 1. Sequencing, assembly, and annotation results for the sequenced strains

Strain	KACC 18744^T	KACC 18716^T	KACC 19094^T
Species	Paenibacillus kyungheensis	Sphingomonas naphthae	Novosphingobium humi
BioProject accession no.	PRJNA992936	PRJNA930972	PRJNA930975
BioSample accession no.	SAMN36377341	SAMN33038042	SAMN33039045
GenBank assembly accession no.	GCF_028606985	GCF_028607085	GCF_028607105
GenBank accession no.	CP117416	CP117411-CP117415	CP117417-CP117420
SRA accession no.
Illumina	SRR25238406	SRR23952290	SRR24694047
PacBio	SRR25238405	SRR23952291	SRR24694046
HiFi reads
Total length (bp)	1,227,207,177	1,714,875,056	1,475,901,199
Total no. of reads	136,392	190,527	171,360
N₅₀ (bp)	9,721	9,666	9,271
Mean quality	Q35	Q32	Q33
Illumina
Total length (bp)	3,355,346,840	3,219,202,522	3,158,184,630
Paired length (bp)	2,623,250,643	2,342,773,538	2,402,495,342
Filtered no. of reads	17,380,330	15,521,426	15,915,742
Q20 (%)	99.36	99.37	99.35
De novo assembly
No. of contigs	1 (circular)	5 (circular)	4 (circular)
Total length (bp)	5,258,865	4,309,746	4,890,308
Corrected total length (bp)	5,260,882	4,310,851	4,890,578
G+C content (%)	39.3	67.3	63.6
Sequencing depth	233.0×	397.2×	301.4×
BUSCO completion (%)	99.19	100	99.19
Chromosome length (bp)	5,260,882	3,919,827	3,439,207
Annotation results
No. of genes	4,586	4,199	4,435
No. of CDS	4,480	4,143	4,368
No. of RNA genes	106	56	67
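
The sequencing depth row in Table 1 can be roughly cross-checked by dividing the total HiFi read length by the corrected assembly length. The sketch below does this in Python. The formula is an assumption, since the announcement does not state how depth was computed, and the dictionary and function names are illustrative only. The estimates land slightly above the reported values, which would be consistent with depth having been calculated from mapped bases rather than all bases.

```python
# Values copied from Table 1 (HiFi total length, corrected total length, reported depth).
TABLE1 = {
    "KACC 18744": {"hifi_bases": 1_227_207_177, "assembly_bp": 5_260_882, "reported_depth": 233.0},
    "KACC 18716": {"hifi_bases": 1_714_875_056, "assembly_bp": 4_310_851, "reported_depth": 397.2},
    "KACC 19094": {"hifi_bases": 1_475_901_199, "assembly_bp": 4_890_578, "reported_depth": 301.4},
}


def estimated_depth(total_bases: int, genome_length: int) -> float:
    """Naive depth estimate: sequenced bases per assembled base (assumed formula)."""
    return total_bases / genome_length


for strain, row in TABLE1.items():
    depth = estimated_depth(row["hifi_bases"], row["assembly_bp"])
    reported = row["reported_depth"]
    print(f"{strain}: estimated {depth:.1f}x, reported {reported}x")
```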

Keywords

Paenibacillus · Novosphingobium · Sphingomonas · genomes · type strain

Taxonomy

Topics: Genomics and Phylogenetic Studies · Probiotics and Fermented Foods · Bacteriophages and microbial interactions

Full text

ANNOUNCEMENT

The genomic information of type strains plays a crucial role in bacterial phylogenetics, functional gene analysis, and comparative genomic analyses. Despite its importance, genome analysis was not mandatory for the description of novel bacterial species prior to 2018 (1), and many type strains reported before then still lack sufficient genomic data. To address this gap, we sequenced the genomes of type strains preserved at the Korean Agricultural Culture Collection (KACC). Specifically, our study focused on three type strains (Paenibacillus kyungheensis KACC 18744^T^, Sphingomonas naphthae KACC 18716^T^, and Novosphingobium humi KACC 19094^T^) that were isolated and reported in Korea between 2015 and 2017 (2–4) and for which genomic data had remained unavailable.

These strains were cultured on Reasoner’s 2A medium (BD Difco, NJ, USA) at pH 6.0 and 28°C for 3 days under aerobic conditions. Genomic DNA was extracted using the Qiagen MagAttract HMW DNA kit (Qiagen, Hilden, Germany) according to the manufacturer’s protocol. The extracted DNA was sequenced on an Illumina MiSeq (Illumina, CA, USA) and a PacBio Sequel IIe (Pacific Biosciences, CA, USA). Genomic DNA libraries were created using the TruSeq Nano DNA High Throughput Library Prep kit (Illumina, CA, USA). A total of 10 µL of library was prepared as 7–12 kb templates for the PacBio SMRTbell prep kit 3.0 and then processed using the Sequel II Bind Kit 3.2 and Int Ctrl 3.2. Sequencing was carried out using the Sequel II Sequencing Kit 2.0 and SMRT Cell 8M trays. HiFi reads were obtained from the PacBio Sequel IIe system and assembled using the microbial assembly application in SMRT Link 11.0.0.146107 with default parameters, based on the Hierarchical Genome Assembly Process (5). HiFi reads were generated with a minimum quality value of 20, which corresponds to 99% predicted accuracy.
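
The link between a quality value of 20 and 99% predicted accuracy follows from the standard Phred relation, in which the error probability is 10^(-Q/10). A minimal Python sketch of that conversion is shown below; the function name is illustrative and not taken from any PacBio or Illumina tool.

```python
def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality value to predicted per-base accuracy.

    Standard Phred definition: error probability = 10 ** (-Q / 10).
    """
    error_probability = 10 ** (-q / 10)
    return 1.0 - error_probability


# Q20 is the HiFi threshold quoted in the text; Q30 is the Illumina filter threshold.
for q in (20, 30):
    print(f"Q{q}: {phred_to_accuracy(q):.3%} predicted accuracy")
```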

Illumina raw reads were filtered to retain those in which at least 90% of the bases had a Phred score of 30 or higher, and adapter trimming was performed using Trimmomatic 0.38 (6). The assembly, initially constructed from long-read data, was then polished with Pilon v1.21 to correct errors and improve accuracy (7). Successful circularization was confirmed using the microbial genome analysis application in SMRT Link 11.0.0.146107 (8), and completeness was assessed using BUSCO v5 (9). Gene prediction and annotation were conducted using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) v6.7 (10). All tools were run with default parameters unless stated otherwise.
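
The read-level quality criterion for the Illumina data (at least 90% of bases at Phred 30 or higher) can be sketched in Python as follows. This is not how Trimmomatic or the authors' pipeline implements it. It assumes Phred+33 encoded FASTQ quality strings and reads the criterion as a rule for which reads to keep; the function name and parameters are hypothetical.

```python
def passes_quality_filter(
    quality_string: str,
    min_phred: int = 30,
    min_fraction: float = 0.90,
    offset: int = 33,  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding
) -> bool:
    """Return True if enough bases in the read meet the Phred threshold."""
    if not quality_string:
        return False  # an empty read cannot satisfy the criterion
    good_bases = sum(1 for ch in quality_string if ord(ch) - offset >= min_phred)
    return good_bases / len(quality_string) >= min_fraction


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
print(passes_quality_filter("IIIIIIIII#"))  # 9 of 10 bases at Q40
print(passes_quality_filter("IIIIIIII##"))  # 8 of 10 bases at Q40
```

The first call keeps the read (exactly 90% of bases pass) and the second rejects it (80%).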

The genome of P. kyungheensis KACC 18744^T^ consists of a single circular chromosome (5,260,882 bp with GC content of 39.5%). The genome of S. naphthae KACC 18716^T^ (4,310,851 bp in total with GC content of 67.5%) comprises a circular chromosome of 3,919,827 bp and four other contigs. The genome of N. humi KACC 19094^T^ (4,890,578 bp in total with GC content of 63.5%) comprises a circular chromosome of 3,439,207 bp and three other contigs. Additional details and annotation results for the three genomes are shown in Table 1.
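
The combined size of the non-chromosomal contigs is not stated directly, but it can be derived from Table 1 by subtracting the chromosome length from the corrected total length. The Python sketch below does this arithmetic; the variable and function names are illustrative, and the source does not say whether these contigs are plasmids or other replicons.

```python
# (corrected total length, chromosome length) in bp, copied from Table 1.
GENOMES = {
    "P. kyungheensis KACC 18744": (5_260_882, 5_260_882),
    "S. naphthae KACC 18716": (4_310_851, 3_919_827),
    "N. humi KACC 19094": (4_890_578, 3_439_207),
}


def non_chromosomal_bp(total_bp: int, chromosome_bp: int) -> int:
    """Length of all contigs other than the chromosome."""
    return total_bp - chromosome_bp


for name, (total_bp, chromosome_bp) in GENOMES.items():
    extra = non_chromosomal_bp(total_bp, chromosome_bp)
    share = 100 * extra / total_bp
    print(f"{name}: {extra:,} bp outside the chromosome ({share:.1f}% of the genome)")
```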

Bibliography (10)

1. Chun J, Oren A, Ventosa A, Christensen H, Arahal DR, da Costa MS, Rooney AP, Yi H, Xu X-W, De Meyer S, Trujillo ME. 2018. Proposed minimal standards for the use of genome data for the taxonomy of prokaryotes. Int J Syst Evol Microbiol 68:461–466. doi:10.1099/ijsem.0.002516. PMID 29292687.
2. Siddiqi MZ, Siddiqi MH, Im WT, Kim YJ, Yang DC. 2015. Paenibacillus kyungheensis sp. nov., isolated from flowers of magnolia. Int J Syst Evol Microbiol 65:3959–3964. doi:10.1099/ijsem.0.000521. PMID 26268929.
3. Chaudhary DK, Kim J. 2016. Sphingomonas naphthae sp. nov., isolated from oil-contaminated soil. Int J Syst Evol Microbiol 66:4621–4627. doi:10.1099/ijsem.0.001400. PMID 27506439.
4. Hyeon JW, Kim K, Son AR, Choi E, Lee SK, Jeon CO. 2017. Novosphingobium humi sp. nov., isolated from soil of a military shooting range. Int J Syst Evol Microbiol 67:3083–3088. doi:10.1099/ijsem.0.002089. PMID 28829033.
5. Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, Clum A, Copeland A, Huddleston J, Eichler EE, Turner SW, Korlach J. 2013. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods 10:563–569. doi:10.1038/nmeth.2474. PMID 23644548.
6. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi:10.1093/bioinformatics/btu170. PMID 24695404. PMC4103590.
7. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. doi:10.1371/journal.pone.0112963. PMID 25409509. PMC4237348.
8. Hunt M, Silva ND, Otto TD, Parkhill J, Keane JA, Harris SR. 2015. Circlator: automated circularization of genome assemblies using long sequencing reads. Genome Biol 16:294. doi:10.1186/s13059-015-0849-0. PMID 26714481. PMC4699355.