Nuclear and mitochondrial genomes of the plum fruit moth Grapholita funebrana
Li-Jun Cao, Fangyuan Yang, Jin-Cui Chen, Shu-Jun Wei

TL;DR
This paper presents the nuclear and mitochondrial genomes of the plum fruit moth, a pest affecting stone fruits, providing genetic resources for understanding its biology and evolution.
Contribution
The study provides the first assembled nuclear and mitochondrial genomes of Grapholita funebrana using multiple sequencing technologies.
Findings
The nuclear genome is 570.9 Mb, with 51.28% repeats and 97.7% BUSCO completeness.
The male karyotype is 2n = 56, and 17,979 protein-coding genes were identified.
The mitochondrial genome is annotated with 13 protein-coding genes, 22 tRNAs, and 2 rRNAs.
Abstract
The plum fruit moth Grapholita funebrana (Tortricidae, Lepidoptera) is an important pest of many wild and cultivated stone fruits and other plants in the family Rosaceae. Here, we assembled its nuclear and mitochondrial genomes using Illumina, Nanopore, and Hi-C sequencing technologies. The nuclear genome size is 570.9 Mb, with a repeat content of 51.28% and a BUSCO completeness of 97.7%. The karyotype for males is 2n = 56. We identified 17,979 protein-coding genes, 5,643 tRNAs, and 94 rRNAs. We also determined the mitochondrial genome of this species and annotated 13 protein-coding genes, 22 tRNAs, and 2 rRNAs. These genomes provide resources for understanding the genetics, ecology, and genome evolution of tortricid moths.
Funding
- Beijing Key Laboratory of Environmentally Friendly Management on Pests of North China Fruits (BZ0432)
- National Natural Science Foundation of China (https://doi.org/10.13039/501100001809)
Taxonomy
Topics: Insect-Plant Interactions and Control · Plant and animal studies · Plant Virus Research Studies
Background & Summary
The plum fruit moth Grapholita funebrana is an important fruit borer in the family Tortricidae (Lepidoptera)^1,2^. Its larvae bore into the fruits of many wild and cultivated stone fruits and other plants in the family Rosaceae, such as apricot, cherry, peach, and plum^3^. The species is native to Europe and is currently found in fruit-growing regions of Europe, northern Africa, and Asia^4^. In orchards, G. funebrana often co-occurs with other fruit borers, such as the oriental fruit moth Grapholita molesta (Busck), the codling moth Cydia pomonella, and the peach fruit moth Carposina sasakii Matsumura^5^. While many studies have focused on the biology and management of fruit borers, research on G. funebrana lags behind^6–10^. In addition, moths of the family Tortricidae are well suited to studying the evolution of chromosome fusion^11,12^. While species of the order Lepidoptera often have a conserved chromosome number of n = 31, many species in the Tortricidae have a reduced number of chromosomes owing to the fusion of chromosome pairs^13,14^. Recent research found that a common ancestor of the subfamilies Tortricinae and Olethreutinae diverged from the ancestral lepidopteran chromosome pattern through a fusion of sex chromosomes with autosomes^15^. The karyotypes of tortricid moths have traditionally been studied by cytogenetic methods and fluorescence in situ hybridization^15^. Determining genome sequences will improve understanding of the molecular evolution of tortricid chromosomes^16^. Chromosome-level genomes have been published for C. pomonella^16^ and G. molesta^17^, and many further assemblies for Tortricidae are publicly available in GenBank (https://www.ncbi.nlm.nih.gov/datasets/genome/?taxon=7139).
In this study, we assembled a chromosome-level nuclear genome for G. funebrana, as well as its mitochondrial genome, using Oxford Nanopore Technologies (ONT) long-read sequencing, Illumina short-read sequencing, high-throughput chromatin conformation capture (Hi-C) sequencing, and RNA sequencing (RNA-seq). We obtained a nuclear genome assembly of 570.9 Mb with an N50 of 21 Mb. These high-quality genomes provide resources for the study of G. funebrana and for in-depth investigation of chromosome evolution at macroevolutionary and microevolutionary levels.
Methods
Material and sequencing
Apricot (Prunus armeniaca) fruits containing G. funebrana larvae were collected from Yanqing, Beijing, China, and reared in the laboratory for about 30 days to obtain specimens of different developmental stages. To reduce the effect of heterozygosity, a single larva was used for long-read, short-read, and Hi-C library construction. A single larva, a single pupa, and a single adult (sex unknown) were each used to construct a separate RNA-seq library. All samples were flash-frozen in liquid nitrogen immediately after collection and stored at −80 °C for subsequent experiments.
Genomic DNA was extracted using a magnetic bead method (Invitrogen, Thermo Fisher Scientific, USA), and RNA was extracted using the RNAprep Pure Plus Kit (Tiangen, China). The quantity of DNA was measured using Qubit 3.0. To generate short-read data for the genome survey, an Illumina library with an insert size of 350 bp was constructed and sequenced on the Illumina NovaSeq 6000 platform. For de novo genome assembly, a 15–20 kb ONT library was prepared and sequenced on the ONT platform to generate long-read data. To generate the Hi-C data, tissue from a larva was fixed with paraformaldehyde and digested with the restriction enzyme DpnII, generating fragments with sticky ends. These sticky ends were repaired using DNA polymerase and ligated together with DNA ligase to form chimeric circles. The ligated DNA was then decrosslinked, purified, and sheared to an insert size of 350 bp. The Hi-C library was sequenced on the Illumina NovaSeq 6000 platform to generate 150-bp paired-end reads. For genome annotation, paired-end RNA-seq libraries were constructed using the VAHTS™ mRNA-seq V2 Library Prep Kit (Vazyme, Nanjing, China) and sequenced on the Illumina NovaSeq 6000 platform to generate 150-bp paired-end reads. In total, 33.7 Gb of Illumina short reads, 69.7 Gb of ONT long reads, 58.3 Gb of Hi-C reads, and 21.9 Gb of RNA-seq reads were generated. The raw Illumina reads were filtered with Fastp v0.21.0^18^ using default parameters.
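The data yields above can be put in perspective as approximate sequencing depth. The following is a minimal sketch, assuming that "Gb" denotes gigabases and that the final assembly size of 570.9 Mb is the appropriate denominator; neither the function name nor the depth figures come from the paper.

```python
def sequencing_depth(data_gb: float, genome_mb: float) -> float:
    """Approximate fold coverage: total sequenced bases / genome size.

    Assumes data_gb is in gigabases and genome_mb in megabases.
    """
    if genome_mb <= 0:
        raise ValueError("genome size must be positive")
    return data_gb * 1000.0 / genome_mb


# Yields reported in the text (RNA-seq is left out because depth over the
# whole genome is not a meaningful measure for transcriptome data).
YIELDS_GB = {"Illumina": 33.7, "ONT": 69.7, "Hi-C": 58.3}
ASSEMBLY_MB = 570.9  # final assembly size reported in the text

depths = {
    name: round(sequencing_depth(gb, ASSEMBLY_MB), 1)
    for name, gb in YIELDS_GB.items()
}

if __name__ == "__main__":
    for name, depth in depths.items():
        print(f"{name}: ~{depth}x")
```

Using the k-mer based estimate of about 515 Mb instead of the assembly size would give somewhat higher depth values.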
Genome survey
The genome survey was performed using a k-mer based method. K-mers were counted from the Illumina short reads using Jellyfish version 2.2.10^19^ with parameters: ‘count -m 21 -C -s 5G’. Genome size, heterozygosity, and duplication rate were estimated using GenomeScope version 2.0^20^. The results indicated a genome size of about 515 Mb, a heterozygosity rate of 1.91%, and a duplication rate of 1.21%.
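The idea behind a k-mer based survey can be illustrated in a few lines of Python. This is a simplified sketch, not the Jellyfish or GenomeScope implementation: it counts canonical k-mers (a k-mer and its reverse complement are treated as one, which is what the `-C` flag requests) and estimates genome size as total k-mers divided by the depth of the main histogram peak. GenomeScope instead fits a mixture model that also accounts for heterozygosity and repeats; all function names here are illustrative.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_canonical_kmers(reads, k: int) -> Counter:
    """Count canonical k-mers across reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def estimate_genome_size(counts, min_depth: int = 2) -> float:
    """Crude estimate: total k-mer occurrences / depth of the histogram peak.

    K-mers seen fewer than min_depth times are treated as sequencing errors.
    """
    histogram = Counter(counts.values())  # depth -> number of distinct k-mers
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth
```

With a heterozygosity rate near 2%, a real histogram has separate heterozygous and homozygous peaks, so the single-peak shortcut above would need to pick the homozygous peak to give a sensible haploid size.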
Genome assembly
The Nanopore long reads were assembled into a primary set of nuclear genome contigs using NextDenovo v2.5.1^21^ with parameters: ‘read_cutoff = 1k, genome_size = 400 m, pa_correction = 20, nextgraph_options = -a 1’. The primary assembly comprised 215 contigs with a total size of 594 Mb and an N50 of 6.6 Mb. Because assemblies based on ONT reads have a high error rate, the primary contigs were polished using NextPolish 1.4.1^22^ with one round based on long reads and one round based on short reads.
To achieve a chromosome-level assembly, the polished contigs were anchored into pseudomolecules using Hi-C read information. Specifically, the Hi-C reads were mapped to the contigs using Chromap 0.2.4^23^ with options: “--preset hic --remove-pcr-duplicates --trim-adapters --SAM”. The SAM output was sorted by read name and written in BAM format using Samtools v1.17^24^ with options: “sort -n -O BAM”. YaHS v1.2a.1^25^ and Juicebox 1.22.01^26^ were then used for unsupervised and supervised scaffolding, respectively. After scaffolding, most contigs (95.3% of contigs and 99.86% of base pairs) were anchored into 28 pseudo-chromosomes (Fig. 1a), consistent with the karyotype of most species in the subfamily Olethreutinae. To fill the gaps between contigs, we performed two rounds of polishing based on long and short reads using NextPolish. The final assembly has a genome size of 570.9 Mb, with an N50 of 21 Mb. The assembled genome is 56.9 Mb larger than the estimated genome size.
The MitoZ v3.6 pipeline^27^ was used to assemble the mitochondrial genome with Megahit v1.29^28^ (“--kmers_megahit 39 59 79 99 119 141 --requiring_taxa Lepidoptera”) and to annotate it. The mitochondrial genome of G. funebrana is 15,488 bp in length and contains 13 protein-coding genes, 22 tRNA genes, and 2 rRNA genes (Fig. 1b).
Fig. 1. Hi-C interaction heat map of the nuclear genome (a) and distribution of genes and read coverage on the mitochondrial genome (b).
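The N50 values quoted for the contig and scaffold assemblies follow the usual definition: the length of the sequence at which the cumulative length, taken from longest to shortest, first reaches half of the total assembly size. A minimal Python sketch of that definition is given below; it is an illustration only, not the tool the authors used to compute their statistics.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest, and the function returns
    the length at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy example with made-up contig lengths (not data from the paper).
    print(n50([2, 3, 4, 5, 6]))
```

Applied to the sequence lengths of an assembly FASTA file, the same function would yield values comparable to the 6.6 Mb (contigs) and 21 Mb (scaffolds) reported here.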
Genome annotations
For repeat sequence annotation, a species-specific repeat library was generated using RepeatModeler v2.0.4^29^ with options: “-LTRStruct”. The species-specific repeat library, the RepBase database, and a repeat element library for Arthropoda from the Dfam database were then combined and passed to RepeatMasker v4.1.4^30^ for repeat annotation. RepeatMasker was run with options: “-no_is -norna -xsmall -q”.
For gene structure annotation, we used a pipeline integrating RNA-seq-based, ab initio, and homology-based methods. The RNA-seq reads from the larva, pupa, and adult libraries were mapped to the final assembly with HISAT2 v2.2.0^27^ and assembled into transcripts with StringTie v2.1.2^31^. The transcriptome assemblies and the protein sequences of Plutella xylostella (Accession: GCA_932276165.1^32^) were provided as evidence to the MAKER v3.01.04 pipeline^26^ for integration. SNAP v2013-02-16^28^ and Augustus v3.2.3^29^ were used for ab initio annotation. Transfer RNAs (tRNAs) were predicted using tRNAscan-SE 2.0.12^33^ with default parameters, and ribosomal RNAs (rRNAs) were predicted using Barrnap 0.9 (https://github.com/tseemann/barrnap). The above gene models were merged into consensus models by EVidenceModeler v2.1.0^33^. Functional annotation of protein-coding genes was performed using EggNOG-mapper v2^34^.
Chromosome feature
Gene number, repeat sequence density, and guanine-cytosine (GC) content were calculated in non-overlapping 500 kb windows using Bedtools v2.30.0^35^. Chromosomes were named after the lepidopteran ancestral linkage groups^14^, based on homology to Sesia bembeciformis^36^. Homology was detected using LAST^37^ alignment. A Circos plot of chromosome features was generated with TBtools v2.021^38^ (Fig. 2a).
Fig. 2. Chromosome features of the Grapholita funebrana genome. (a) Circos plot of GC content, gene count, and repeat content. Chromosomes are labeled with Merian elements according to their homology with the lepidopteran ancestral linkage groups^14^. (b) Synteny blocks between G. funebrana and G. molesta reveal the same number of chromosomes and a highly conserved gene order in the two moths. The chromosomes of the two genomes are numbered according to their length. The grey lines show the synteny blocks between the two genomes.
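The windowed GC calculation can be sketched in plain Python as follows. This is a stand-in for illustration, not the Bedtools commands the authors ran; the function name and the choice to ignore non-ACGT bases (such as N in assembly gaps) when computing the fraction are assumptions.

```python
def gc_content_windows(sequence: str, window: int = 500_000):
    """Compute GC fraction in consecutive non-overlapping windows.

    Returns a list of (start, end, gc_fraction) tuples with 0-based,
    half-open coordinates. The last window may be shorter than `window`.
    gc_fraction is None when a window contains no A, C, G or T bases.
    """
    if window <= 0:
        raise ValueError("window size must be positive")
    results = []
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window].upper()
        acgt = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        fraction = gc / acgt if acgt else None
        results.append((start, start + len(chunk), fraction))
    return results


if __name__ == "__main__":
    # Toy sequence with a tiny window so the output is easy to inspect.
    for start, end, gc in gc_content_windows("GGCCAATTAT", window=4):
        print(start, end, gc)
```

Gene counts and repeat density per window follow the same pattern, with interval overlaps counted per window instead of bases.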
Data Records
Illumina, Nanopore, Hi-C, and transcriptome data for G. funebrana genome sequencing have been deposited in the NCBI Sequence Read Archive under accession number SRP482231^39^. The final assembled nuclear genome of G. funebrana has been deposited in NCBI GenBank under accession number GCA_038095595.1^40^. The mitochondrial genome has been deposited in NCBI GenBank under accession number PP776023^41^. The genome assembly and annotation files are available in Figshare^42^.
Technical Validation
The Hi-C heatmap revealed a well-structured interaction pattern. Short-read sequencing data were mapped to the final assembly with BWA v0.7.17^43^, giving a mapping rate of 97.7%. The completeness of the G. funebrana genome assembly was evaluated using BUSCO^44^ based on the lepidoptera_odb10 database (n = 5286). The completeness of the initial assembly (contig level) was 90.9%, and it increased to 97.7% (97.2% single-copy genes, 0.5% duplicated genes, 0.6% fragmented genes, and 1.7% missing genes) after polishing with NextPolish^22^ (Table 1). We identified 14,547 protein-coding genes, 11,673 of which were functionally annotated. The completeness of the annotated gene set was 95.8% (94.8% single-copy genes, 1.0% duplicated genes, 1.1% fragmented genes, and 3.1% missing genes). A synteny analysis between G. funebrana and G. molesta^17^ was performed using MCscan in the JCVI package^45^. Strong syntenic blocks were found between the two closely related species (Fig. 2b). All evidence supports the completeness and accuracy of the G. funebrana genome assembly.

Table 1. Statistics of the G. funebrana genome assembly.

| Item | Contig | Purged contig | Hi-C raised scaffold | Polished scaffold |
| --- | --- | --- | --- | --- |
| No. of contigs | 215 | 175 | 28 | 28 |
| Size (Mb) | 593.9 | 580.3 | 579.6 | 570.9 |
| N50 (Mb) | 6.6 | 7.2 | 21.4 | 21.0 |
| GC content | 37.8% | 37.6% | 37.6% | 37.5% |
| Single-copy BUSCOs | 90.2% | 90.9% | 90.5% | 97.2% |
| Duplicated BUSCOs | 0.7% | 0.4% | 0.3% | 0.5% |
| Fragmented BUSCOs | 4.4% | 4.4% | 4.4% | 0.6% |
| Missing BUSCOs | 4.7% | 4.7% | 4.8% | 1.7% |
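The completeness figures quoted in the text are the sum of the single-copy and duplicated BUSCO percentages. The short Python sketch below recomputes that sum from the Table 1 values for each assembly stage; the dictionary layout and function name are illustrative and are not part of the BUSCO software.

```python
# Single-copy and duplicated BUSCO percentages per assembly stage,
# transcribed from Table 1.
TABLE1_BUSCO = {
    "Contig": {"single": 90.2, "duplicated": 0.7},
    "Purged contig": {"single": 90.9, "duplicated": 0.4},
    "Hi-C raised scaffold": {"single": 90.5, "duplicated": 0.3},
    "Polished scaffold": {"single": 97.2, "duplicated": 0.5},
}


def busco_completeness(single: float, duplicated: float) -> float:
    """Complete BUSCOs (%) = single-copy (%) + duplicated (%)."""
    return round(single + duplicated, 1)


completeness = {
    stage: busco_completeness(values["single"], values["duplicated"])
    for stage, values in TABLE1_BUSCO.items()
}

if __name__ == "__main__":
    for stage, value in completeness.items():
        print(f"{stage}: {value}% complete")
```

The contig-level and polished-scaffold results reproduce the 90.9% and 97.7% stated in the text.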
References
- 1. Li, L.-L. Functional disparity of four pheromone-binding proteins from the plum fruit moth Grapholita funebrana Treitscheke in detection of sex pheromone components. Int. J. Biol. Macromol. 225, 1267–1279 (2023). doi:10.1016/j.ijbiomac.2022.11.186. PMID: 36423808.
- 2. Lo Verde, G., Guarino, S., Barone, S. & Rizzo, R. Can mating disruption be a possible route to control plum fruit moth in mediterranean environments? Insects 11, 589 (2020). doi:10.3390/insects11090589. PMID: 32882909. PMCID: PMC7564571.
- 3. Dickler, E. Tortricid pests of pome and stone fruits, eurasian species. in Tortricids Pests, Their Biology, Natural Enemies and Control (eds. van der Geest, L. P. S. & Evenhuis, H. H.) 435–452 (Elsevier, Amsterdam, Netherlands, 1991).
- 4. F, K. A taxonomic review of the genus Grapholita and allied genera (Lepidoptera: Tortricidae) in the Palaearctic region. Ent. Scand. Suppl. 55, 110 (1999).
- 5. Chen, M. H. & Dorn, S. Reliable and efficient discrimination of four internal fruit-feeding Cydia and Grapholita species (Lepidoptera: Tortricidae) by polymerase chain reaction-restriction fragment length polymorphism. J. Econ. Entomol. 102, 2209–2216 (2009). doi:10.1603/029.102.0625. PMID: 20069850.
- 6. Ioriatti, C. Toxicity of emamectin benzoate to Cydia pomonella (L.) and Cydia molesta (Busck) (Lepidoptera: Tortricidae): laboratory and field tests. Pest Manag. Sci. 65, 306–312 (2009). doi:10.1002/ps.1689. PMID: 19097022.
- 7. Liu, J. Reverse chemical ecology guides the screening for Grapholita molesta pheromone synergists. Pest Manag. Sci. 78, 643–652 (2022). doi:10.1002/ps.6674. PMID: 34658157.
- 8. Stelinski, L. L., Il’ichev, A. L. & Gut, L. J. Efficacy and release rate of reservoir pheromone dispensers for simultaneous mating disruption of codling moth and oriental fruit moth (Lepidoptera: Tortricidae). J. Econ. Entomol. 102, 315–323 (2009). doi:10.1603/029.102.0142. PMID: 19253651.
