Metagenome-assembled genome sequence of Spiroplasma phoeniceum, assembled from the hindgut of Locusta migratoria, a migration grasshopper species
Jaeha Kim, Takumi Murakami, Atsushi Toyoda, Hiroshi Mori

TL;DR
Scientists assembled the genome of Spiroplasma phoeniceum from the gut of a grasshopper, revealing a 1.06 million base pair genome with 26.3% GC content.
Contribution
The study provides a metagenome-assembled genome sequence of S. phoeniceum from a grasshopper host for the first time.
Findings
The MAG sequence of S. phoeniceum is 1,059,205 bp long and consists of 91 contigs.
The genome has a GC content of 26.3%.
The genome was assembled from the hindgut of Locusta migratoria.
Abstract
Spiroplasma phoeniceum is a plant pathogen and a mesophilic microaerophile. Here, we report the metagenome-assembled genome (MAG) sequence of S. phoeniceum binned from the hindgut contents of a wild-type male Locusta migratoria, a grasshopper species. The MAG sequence comprises 1,059,205 bp in 91 contigs with a GC content of 26.3%.
| Feature | Lm2022 | P40 | PmF2021_Spiro | PmM2021_Spiro |
|---|---|---|---|---|
| Genome ID | | | | |
| Genome size (bp) | 1,059,205 | 1,908,276 | 2,171,854 | 2,179,570 |
| No. of chromosomes or contigs | 91 | 1 | 1 | 1 |
| No. of plasmids | N.D. | 3 | N.D. | N.D. |
| GC content (%) | 26.3 | 25.4 | 24.4 | 24.4 |
| No. of CDSs | 1,106 | 2,619 | 3,040 | 3,052 |
| No. of rRNA genes | 1 | 3 | 3 | 3 |
| No. of tRNA genes | 31 | 43 | 31 | 31 |
| No. of unique CDSs | 145 | 1,064 | 0 | 2 |
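To make the table easier to compare across genomes, the following minimal sketch derives three ratios from the reported values: genome size relative to P40, CDS density, and the fraction of CDSs listed as unique. The dictionary layout, key names, and the `summarize` function are illustrative assumptions and are not part of the original analysis.

```python
# Values copied from the table above; the data layout and key names are
# hypothetical and chosen only for this illustration.
TABLE1 = {
    "Lm2022": {"size_bp": 1_059_205, "cds": 1_106, "unique_cds": 145},
    "P40": {"size_bp": 1_908_276, "cds": 2_619, "unique_cds": 1_064},
    "PmF2021_Spiro": {"size_bp": 2_171_854, "cds": 3_040, "unique_cds": 0},
    "PmM2021_Spiro": {"size_bp": 2_179_570, "cds": 3_052, "unique_cds": 2},
}


def summarize(table, reference="P40"):
    """Return simple per-genome ratios derived from the table values."""
    ref_size = table[reference]["size_bp"]
    summary = {}
    for name, row in table.items():
        summary[name] = {
            # Genome size as a fraction of the reference genome size.
            "size_vs_ref": row["size_bp"] / ref_size,
            # Number of CDSs per 1,000 bp of sequence.
            "cds_per_kb": 1000 * row["cds"] / row["size_bp"],
            # Share of this genome's CDSs that the table lists as unique.
            "unique_cds_fraction": row["unique_cds"] / row["cds"],
        }
    return summary


if __name__ == "__main__":
    for name, stats in summarize(TABLE1).items():
        print(
            f"{name}: size/P40={stats['size_vs_ref']:.3f}, "
            f"CDS/kb={stats['cds_per_kb']:.3f}, "
            f"unique={stats['unique_cds_fraction']:.3f}"
        )
```

These ratios are descriptive only; they do not replace the completeness and contamination estimates reported for the MAG.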
Taxonomy
Topics: Entomopathogenic Microorganisms in Pest Control · Phytoplasmas and Hemiptera pathogens · Insect symbiosis and bacterial influences
ANNOUNCEMENT
Spiroplasma phoeniceum is a plant pathogen and mesophilic microaerophile. It was isolated in culture from plants of Catharanthus roseus in Syria in 1987 and was reported as S. phoeniceum P40 (1). The complete genome of S. phoeniceum P40 was reported in 2019 (2). Here, we report a metagenome-assembled genome (MAG) sequence of S. phoeniceum binned from the hindgut contents of a wild-type male Locusta migratoria, a grasshopper species.
A wild-type male L. migratoria was collected in the Kisegawa area (lat. 35.10325, long. 138.88771), Japan, in July 2022 and immediately transferred to a −30°C freezer. The hindgut was then carefully dissected out on ice. DNA was isolated with the QIAamp Fast DNA Stool Mini Kit (Qiagen GmbH, Hilden, Germany) following the standard DNA isolation protocol and quantified with a Qubit (Life Technologies, CA, USA). The quality of the eluted DNA was checked with an Agilent 2100 Bioanalyzer (Agilent Technologies, CA, USA). Sequencing libraries were generated using the Illumina Nextera Flex Kit (Illumina, San Diego, CA, USA) and sequenced on the Illumina NovaSeq 6000 platform to generate 150-bp paired-end reads, yielding 33,715,814 read pairs.
Read quality filtering (i.e., adaptor sequence removal and low-quality sequence trimming) was performed using fastp v.0.23.2 (3). To remove host DNA sequences, reads were mapped to the L. migratoria genome sequence (NCBI GenBank ID: GCA_026315105.1) using BWA-MEM v.0.7.17 (4) with the default parameters. The reads were assembled into contigs using MEGAHIT v.1.2.9 (5) with the default parameters. Bacterial contig sequences were identified, and unassigned sequences were excluded, using Kraken2 v.2.0.8 (6) with the standard Kraken 2 database v.Sep.2022. For contig binning, the BWA-MEM v.0.7.17 (4) read mapping result against the bacterial contigs was used. Metagenome-assembled genomes were constructed using MetaWRAP v.1.3.2 (7). The taxonomic name was inferred using GTDB-Tk v.2.1.1 (8). The MAG was reassembled by mapping to the reference genome S. phoeniceum P40 (GCF_003339775.1) using RagTag v.2.1.0 (9).
Protein-coding genes on the MAG were annotated using DFAST v.1.2.0 (10). To compare genomes annotated with the same method, genome sequences of S. phoeniceum P40 (GCF_003339775.1) and the S. phoeniceum MAGs PmF2021_Spiro (CP110829.1) and PmM2021_Spiro (CP110830.1) were obtained and reannotated using DFAST. The protein-coding genes of the four strains were clustered using OrthoFinder v.2.5.2 (11). Default parameters were used except where otherwise noted.
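The read-processing, assembly, and scaffolding steps described above can be sketched as a list of command lines. This is a minimal illustration under stated assumptions, not the authors' actual commands: all file names and the `build_commands` helper are hypothetical, the text does not specify how unmapped (non-host) reads were extracted after BWA-MEM mapping, so that step is represented only by placeholder input files, and the MetaWRAP, GTDB-Tk, DFAST, and OrthoFinder steps are omitted. The function only builds the commands; it does not run them.

```python
from pathlib import Path


def build_commands(r1, r2, host_ref, mag_ref, kraken_db, outdir):
    """Build (but do not execute) command lines for part of the workflow.

    All paths are hypothetical placeholders chosen for this sketch.
    """
    out = Path(outdir)
    clean1, clean2 = str(out / "clean_R1.fastq.gz"), str(out / "clean_R2.fastq.gz")
    # Placeholder names for reads that did not map to the host genome.
    # How these files are produced is not described in the text.
    nohost1, nohost2 = str(out / "nohost_R1.fastq.gz"), str(out / "nohost_R2.fastq.gz")
    asm_dir = str(out / "megahit")
    contigs = str(Path(asm_dir) / "final.contigs.fa")  # MEGAHIT's contig output file
    mag_fasta = str(out / "bin.fa")  # placeholder for the binned MAG

    return [
        # Adapter removal and quality trimming of paired-end reads.
        ["fastp", "-i", r1, "-I", r2, "-o", clean1, "-O", clean2],
        # Index the host genome, then map cleaned reads (SAM goes to stdout).
        ["bwa", "index", host_ref],
        ["bwa", "mem", host_ref, clean1, clean2],
        # Assemble the host-depleted reads into contigs.
        ["megahit", "-1", nohost1, "-2", nohost2, "-o", asm_dir],
        # Taxonomic classification of contigs.
        ["kraken2", "--db", kraken_db,
         "--report", str(out / "kraken2.report"),
         "--output", str(out / "kraken2.out"), contigs],
        # Reference-guided scaffolding of the MAG against the P40 genome.
        ["ragtag.py", "scaffold", mag_ref, mag_fasta, "-o", str(out / "ragtag")],
    ]


if __name__ == "__main__":
    for cmd in build_commands("R1.fastq.gz", "R2.fastq.gz", "host.fa",
                              "P40.fa", "kraken2_db", "results"):
        print(" ".join(cmd))
```

Each command list could be passed to `subprocess.run` once the corresponding tool is installed; keeping command construction separate from execution makes the sketch easy to inspect before anything is run.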
As a result, a 1,059,205-bp MAG (Lm2022) comprising 91 contigs (N50 = 834,464 bp), with a GC content of 26.3%, 93.58% completeness, and 0.0% contamination, was obtained. The average nucleotide identity (ANI) between S. phoeniceum P40 and Lm2022 is 96.52%, calculated using the OrthoANIu tool (12). The genome differences among the four S. phoeniceum strains are summarized in Table 1. The MAG sequence provides important information about the genomic diversity of this species.
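Summary statistics such as total length, contig count, N50, and GC content can be recomputed from a FASTA file of contigs. The sketch below shows one common way to do this; the function names are hypothetical, the text does not state which tool produced the reported values, and completeness and contamination are not covered because they require marker-gene analysis.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) tuples from a FASTA-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def n50(lengths):
    """Length of the contig at which the cumulative sum of lengths,
    taken from longest to shortest, reaches half of the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def assembly_stats(sequences):
    """Return total length, contig count, N50, and GC percentage."""
    seqs = [s.upper() for s in sequences]
    lengths = [len(s) for s in seqs]
    total = sum(lengths)
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    return {
        "total_bp": total,
        "n_contigs": len(seqs),
        "n50": n50(lengths),
        "gc_percent": 100.0 * gc / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy input for illustration only; not data from the study.
    toy = io.StringIO(">c1\nGGCC\n>c2\nAATT\n>c3\nGC\n")
    print(assembly_stats(seq for _, seq in read_fasta(toy)))
```

For a real assembly, the FASTA file would be opened with `open(path)` and passed to `read_fasta` in the same way.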
**TABLE 1: Genome differences among four S. phoeniceum strains**
1. Saillard C, Vignault JC, Bove JM, Raie A, Tully JG, Williamson DL, Fos A, Garnier M, Gadeau A, Carle P, Whitcomb RF. 1987. Spiroplasma phoeniceum sp. nov., a new plant-pathogenic species from Syria. Intl J Syst Bacteriol 37:106–115. doi:10.1099/00207713-37-2-106
2. Davis RE, Shao J, Zhao Y, Wei W, Bottner-Parker K, Silver A, Stump Z, Gasparich GE, Donofrio N. 2019. Complete genome sequence of Spiroplasma phoeniceum strain P40T, a plant pathogen isolated from diseased plants of Madagascar Periwinkle [Catharanthus roseus (L.) G. Don]. Microbiol Resour Announc 8:e01612-18. doi:10.1128/MRA.01612-18. PMID: 30938707. PMCID: PMC6430324
3. Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890. doi:10.1093/bioinformatics/bty560. PMID: 30423086. PMCID: PMC6129281
4. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760. doi:10.1093/bioinformatics/btp324. PMID: 19451168. PMCID: PMC2705234
5. Li D, Liu CM, Luo R, Sadakane K, Lam TW. 2015. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31:1674–1676. doi:10.1093/bioinformatics/btv033. PMID: 25609793
6. Wood DE, Lu J, Langmead B. 2019. Improved metagenomic analysis with Kraken 2. Genome Biol 20:257. doi:10.1186/s13059-019-1891-0. PMID: 31779668. PMCID: PMC6883579
7. Uritskiy GV, DiRuggiero J, Taylor J. 2018. MetaWRAP-a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6:158. doi:10.1186/s40168-018-0541-1. PMID: 30219103. PMCID: PMC6138922
8. Chaumeil PA, Mussig AJ, Hugenholtz P, Parks DH. 2019. GTDB-Tk: a toolkit to classify genomes with the genome taxonomy database. Bioinformatics 36:1925–1927. doi:10.1093/bioinformatics/btz848. PMID: 31730192. PMCID: PMC7703759
