Complete genome sequence of Neobacillus strain fa239, isolated from soil in Mie, Japan

Ayumi Tanimura

PMC · DOI:10.1128/mra.00837-25·September 24, 2025

Open Access

TL;DR

Scientists sequenced the complete genome of a Neobacillus strain found in soil in Japan.

Contribution

The complete genome sequence of Neobacillus strain fa239 is newly reported.

Findings

01. The genome consists of two circular contigs totaling 6,133,609 bp.

02. The GC content of the genome is 39.3%.

Abstract

This report presents the complete genome sequence of Neobacillus strain fa239, isolated from soil in Mie, Japan. The genome consists of two circular contigs totaling 6,133,609 bp, with a GC content of 39.3%. Sequencing was performed on the PacBio Revio platform.

Linked entities

Chemicals (4)

SAMD01574884 · saline · agar · carbon

Funding (1)

Japan Society for the Promotion of Science (http://dx.doi.org/10.13039/501100001691)

Keywords

Bacillaceae · complete genome sequence

Taxonomy

Topics: Genomics and Phylogenetic Studies · Bacteriophages and microbial interactions · Probiotics and Fermented Foods

Full text

ANNOUNCEMENT

Strain fa239 was isolated from a surface soil sample (0–5 cm depth) collected at Mitakien Higashi Park in Komono Town, Mie Prefecture, Japan (35.018539°N, 136.514897°E), a non-allophanic Andosol site. The sample was suspended in sterile saline, serially diluted, and plated on an in-house agar medium containing inorganic salts, trace elements, and a C1 carbon source. Plates were incubated aerobically at 25°C for 3 days. A single colony was isolated and purified by repeated streaking on YM agar (1).

Taxonomic identification was performed by extracting the 16S rRNA gene sequence from the complete genome assembly and comparing it to type strain sequences using BLASTn against the NCBI nucleotide database (2). The sequence showed 98.45% identity to that of Neobacillus driksii (GenBank accession number PP849388.1).
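As an illustration of what a BLASTn-style percent identity measures, the following minimal sketch computes identity over the columns of a pairwise alignment, counting gap columns as non-matching. The function name and the toy sequences are hypothetical and are not the fa239 or N. driksii 16S rRNA sequences; the reported 98.45% comes from the BLASTn search itself, not from this code.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent identity over all alignment columns.

    Both inputs must already be aligned (equal length, '-' for gaps).
    A column counts as a match only if both residues are equal and
    neither is a gap, so gap columns lower the identity.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for q, s in zip(aligned_query.upper(), aligned_subject.upper())
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


# Toy alignment (hypothetical data): 10 columns, 1 mismatch, 1 gap.
toy_identity = percent_identity("ACGTACGT-A", "ACGTTCGTGA")
print(f"{toy_identity:.2f}%")
```

On a full-length 16S rRNA alignment the same ratio would be taken over roughly 1,500 columns, so a value such as 98.45% corresponds to a few dozen differing columns.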

For genome sequencing, strain fa239 was cultured in YM broth at 25°C overnight. Genomic DNA was extracted using the Genomic-tip 20G Kit (Qiagen). No intentional DNA shearing was performed. Short fragments were removed using the Short Read Eliminator XS Kit (PacBio). Library preparation was performed with the SMRTbell Prep Kit 3.0 and the SMRTbell gDNA Sample Amplification Kit (PacBio) according to the manufacturer’s protocols. Sequencing was carried out on a PacBio Revio platform using the Revio Polymerase Kit. A total of 46,623 reads were obtained (mean length 5.3 kb, N50 5.4 kb; 247.8 Mb in total). Quality filtering with Filtlong v0.2.1 (3) removed reads shorter than 1,000 bp and the lowest-quality 10% of the data, yielding ≥95.5% of bases with Q ≥ 30. Genome coverage was estimated at 40.4×.
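The read statistics above can be cross-checked with simple arithmetic. The sketch below assumes that the coverage estimate is the total sequenced bases divided by the assembled genome size; the text does not state how the 40.4× figure was derived, so this is an assumption that happens to reproduce it. Variable and function names are illustrative.

```python
# Values reported in the announcement.
TOTAL_READS = 46_623
TOTAL_BASES = 247.8e6      # 247.8 Mb, as reported (rounded)
GENOME_SIZE = 6_133_609    # bp, sum of both circular contigs


def mean_read_length(total_bases: float, n_reads: int) -> float:
    """Average read length in bp."""
    return total_bases / n_reads


def coverage(total_bases: float, genome_size: int) -> float:
    """Sequencing depth as total bases divided by genome size (assumed definition)."""
    return total_bases / genome_size


mean_len_bp = mean_read_length(TOTAL_BASES, TOTAL_READS)
depth = coverage(TOTAL_BASES, GENOME_SIZE)

print(f"mean read length: {mean_len_bp / 1000:.1f} kb")
print(f"estimated coverage: {depth:.1f}x")
```

Under that assumption the computed values agree with the reported mean read length (5.3 kb) and coverage (40.4×).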

HiFi reads were generated using SMRT Link v11.0 (4). Adapter trimming was performed with lima v2.12.0 (PacBio) (5), and duplicate reads were removed using pbmarkdup v1.0.3 (PacBio) (6). No additional error correction was performed, as PacBio HiFi reads are already highly accurate. Genome assembly was performed using Flye v2.9.3-b1797 (7), yielding two circular contigs totaling 6,133,609 bp (N50 5,706,330 bp; GC 39.3%). These were a chromosome (5,706,330 bp) and a plasmid (427,279 bp). Circularity of both contigs was verified manually, and overlapping ends were trimmed to produce seamless circular sequences. For the deposited record, assembler-derived coordinates were retained; thus, the chromosomal dnaA gene is at 5,497,224–5,498,570 rather than coordinate 1. The smaller circular contig was deposited unrotated because no unambiguous plasmid replication initiation gene (e.g., rep) was identified. Assembly graphs were visualized with Bandage v0.8.1 (8), and genome completeness and contamination were assessed with CheckM2 v1.2.2 (9), which indicated 100.0% completeness and 9.16% contamination, likely reflecting closely related strains or reference database limitations.
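The assembly figures and the note on dnaA coordinates can be made concrete with a short sketch. It recomputes the total length and N50 from the two contig lengths, and shows how a 1-based position on a circular molecule would map if the chromosome were rotated so that dnaA began at coordinate 1. The deposited record was not rotated; the rotation here is purely illustrative and assumes dnaA lies on the forward strand, which the text does not state. Function names are hypothetical.

```python
def n50(lengths):
    """Smallest contig length such that contigs at least this long
    cover at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")


def rotate_position(pos: int, new_origin: int, length: int) -> int:
    """Map a 1-based position on a circular molecule to its coordinate
    after rotating the sequence so that new_origin becomes position 1."""
    return (pos - new_origin) % length + 1


# Values reported in the announcement.
CHROMOSOME = 5_706_330
PLASMID = 427_279
DNAA_START, DNAA_END = 5_497_224, 5_498_570

total_length = CHROMOSOME + PLASMID
assembly_n50 = n50([CHROMOSOME, PLASMID])

# Hypothetical rotation (forward-strand dnaA assumed).
rotated_start = rotate_position(DNAA_START, DNAA_START, CHROMOSOME)
rotated_end = rotate_position(DNAA_END, DNAA_START, CHROMOSOME)

print(total_length, assembly_n50)
print(rotated_start, rotated_end)
```

With only two contigs, the chromosome alone exceeds half of the total length, which is why the N50 equals the chromosome length.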

Gene prediction used Prokka v1.14.6 (10), followed by functional annotation with DIAMOND v2.1.6 (11) against the KEGG and GO databases (12, 13). Annotation was also performed using DFAST v1.2.18 (14). The complete genome comprises 5,869 predicted coding sequences, 54 rRNA genes, and 160 tRNA genes. A single plasmid replicon was identified. This complete genome sequence will support taxonomic and phylogenetic studies of Neobacillus-related environmental bacteria. Default parameters were used unless otherwise noted.

Bibliography (14)

1. Atlas RM. 2010. Handbook of microbiological media. CRC Press, Boca Raton, FL.
2. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403–410. doi:10.1016/S0022-2836(05)80360-2. PMID 2231712.
3. Wick RR. Filtlong: quality filtering tool for long reads. Available from: https://github.com/rrwick/Filtlong. Accessed 20 August 2025.
4. Wenger AM, Peluso P, Rowell WJ, Chang P-C, Hall RJ, Concepcion GT, Ebler J, Fungtammasan A, Kolesnikov A, Olson ND, et al. 2019. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol 37:1155–1162. doi:10.1038/s41587-019-0217-9. PMID 31406327. PMCID PMC6776680.
5. Pacific Biosciences. Lima: PacBio barcode demultiplexer and adapter removal tool. Available from: https://github.com/PacificBiosciences/barcoding. Accessed 20 August 2025.
6. Pacific Biosciences. Pbmarkdup: PacBio duplicate marking tool. Available from: https://github.com/PacificBiosciences/pbmarkdup. Accessed 20 August 2025.
7. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. 2019. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 37:540–546. doi:10.1038/s41587-019-0072-8. PMID 30936562.
8. Wick RR, Schultz MB, Zobel J, Holt KE. 2015. Bandage: interactive visualization of de novo genome assemblies. Bioinformatics 31:3350–3352. doi:10.1093/bioinformatics/btv383. PMID 26099265. PMCID PMC4595904.