Genome-Wide Identification and Comparative Characterization of Chemosensory Gene Families in Two Phthorimaea Pests
Wangtao Hu, Ruipeng Chen, Qi Su, Yulin Gao

TL;DR
This study identifies and compares chemosensory genes in two pest species, revealing sex- and tissue-specific expression patterns that could help in developing pest control strategies.
Contribution
The study provides a comprehensive identification and comparative analysis of chemosensory gene families in two Phthorimaea pest species, with insights into gene expression patterns.
Findings
47 OBPs, 26 CSPs, and 2 SNMPs were identified in Phthorimaea operculella.
Sex-biased expression was observed in antennae and reproductive tissues, with specific genes upregulated in females and males.
Candidate genes for olfaction and reproduction-related behaviors were identified, offering targets for pest management.
Abstract
The potato tuber moth (Phthorimaea operculella) and the tomato leafminer (Phthorimaea absoluta) are major solanaceous-crop pests. These insects depend on chemosensory genes to detect odors and select hosts and oviposition sites. In this study, we identified three important chemosensory gene families (OBPs, CSPs, and SNMPs) from the genomes of both species and analyzed their evolutionary relationships. Using RNA-seq data from P. operculella, we further examined gene expression patterns across tissues and sexes. Several genes showed strong tissue- and sex-biased expression, especially in antennae and reproductive organs, suggesting roles in olfaction and reproduction-related behaviors. This work provides a useful gene resource and candidate targets for future functional studies and behavior-based pest management. Insects rely on their olfactory systems for host finding, mate choice, and…
Funding
- National Key Research and Development Program of China
- Science and Technology Innovation Project of the Chinese Academy of Agricultural Sciences
Taxonomy
Topics: Neurobiology and Insect Physiology Research · Insect-Plant Interactions and Control · Insect Pheromone Research and Control
1. Introduction
Insect chemosensation is organized as a two-tier peripheral system: soluble peri-receptor carriers, namely odorant-binding proteins (OBPs) and chemosensory proteins (CSPs), bind ligands and shuttle them to membrane receptors (odorant, gustatory, and ionotropic receptors; ORs, GRs, and IRs) on olfactory sensory neurons, where signal transduction is initiated. At the peri-receptor interface, OBPs and CSPs capture, solubilize, buffer, and shuttle hydrophobic ligands within the sensillar lymph, shaping stimulus kinetics and widening the dynamic range presented to receptors [1,2]. On the membrane side, ORs operate with their co-receptor Orco as heteromeric ligand-gated cation channels, while sensory neuron membrane proteins (SNMP1/2; CD36 family) cooperate with chemoreceptors to facilitate detection and transfer of hydrophobic semiochemicals [3,4,5]. Although antennae are the dominant olfactory organs, chemosensory functions also occur in maxillary and labial palps, the proboscis, legs, wing margins, and the ovipositor, enabling both long-range orientation and near-field appraisal during feeding and egg-laying [2].
OBPs and CSPs form the principal soluble-carrier strata and exhibit recognizable sequence architectures and subfamilies. OBPs partition into classic, Minus-C, and Plus-C types with distinct cysteine motifs and inferred disulfide topologies that likely influence ligand selectivity and release dynamics [6,7,8]. CSPs are shorter secreted carriers with a conserved four-cysteine core [9]. Both families often show lineage-specific expansion and tandem chromosomal clustering, consistent with adaptive diversification to diet breadth, pheromone chemistry, and habitat [9,10,11]. In the membrane tier, SNMP1 is typically associated with olfactory receptor neurons and facilitates rapid detection and transfer of hydrophobic semiochemicals, whereas SNMP2 is enriched in non-neuronal accessory cells and thought to assist with ligand movement and clearance, maintaining homeostasis in the sensillar microenvironment [3,4,5]. Together, these elements are organized into a configurable peri-to-receptor workflow comprising ligand capture by OBPs and CSPs; conveyance and presentation to membrane-resident OR, GR, and IR complexes with assistance from SNMPs; receptor activation; and rapid signal clearance or termination. This workflow is retuned across tissues, developmental stages, and physiological states [1].
The potato tuber moth P. operculella and the tomato leaf miner P. absoluta (Lepidoptera: Gelechiidae) are globally important solanaceous pests with complementary life histories [12,13,14,15]. P. operculella oviposits on leaves, stems, and tubers; larvae mine foliage and penetrate tubers, compromising yield and storage quality [13]. In tomato, aerial tissues, including leaves, stems, and fruits, are predominantly damaged by P. absoluta; in potato, its feeding is largely restricted to above-ground organs, constituting a key ecological distinction from P. operculella [12,14]. Host location, mate finding, and oviposition are odor-guided behaviors mediated by plant volatiles and pheromones via the peripheral chemosensory system. Accordingly, behavior-based and environmentally compatible control strategies are informed by the deployment of soluble carriers (OBPs and CSPs) and membrane components and cofactors (such as ORs and SNMPs) in these species [12,13,14].
Despite the availability of high-quality genomes and accumulating antennal datasets, the composition and organization of OBP, CSP, and SNMP repertoires, and their tissue- and sex-biased deployment, remain insufficiently resolved, particularly within reproductive tissues, where near-field chemosensation is expected to contribute to site selection [16,17]. We address two core gaps: (i) a curated, genome-guided identification of OBP, CSP, and SNMP families in P. operculella and P. absoluta and (ii) a systematic characterization of their expression patterns across antennae and reproductive tissues.
Here, a tissue-resolved survey of the OBP, CSP, and SNMP families in P. operculella is presented, anchored by a comparative catalogue in P. absoluta. In P. operculella, 47 OBPs (Classic, Minus C, and Plus C), 26 CSPs (each with an N-terminal signal peptide and a conserved four-cysteine core, with many genes arranged in tandem clusters on a single chromosome), and two SNMPs (SNMP1 and SNMP2) were annotated; in P. absoluta, a parallel catalogue of 39 OBPs and 24 CSPs was assembled. Genome-based annotation was integrated with phylogenetic analysis to delimit family membership and subfamily boundaries; expression was quantified across developmental stages and chemosensory tissues; and sex-biased expression in antennae (female vs. male), together with tissue bias in reproductive organs (female ovipositors vs. male genitalia), was evaluated in P. operculella. On this basis, ovipositor-enriched soluble carriers and SNMP candidates were nominated for a functional follow-up, within the framework of chemosensory-related proteins underlying odor-guided behaviors. Our work provides tractable molecular entry points for ligand-binding assays, genetic perturbation, and behavioral evaluation, supporting chemoreception-based, environmentally compatible pest management in solanaceous systems.
2. Materials and Methods
2.1. Insect Rearing and Tissue Collection
Insects were reared in the laboratory in a climate-controlled chamber at 26 ± 1 °C and 60 ± 10% relative humidity under a 12 h light:12 h dark photoperiod. The larvae were reared on potatoes and placed with the adults in nylon cages. The amounts of head tissue used for each larval stage were as follows: L1 (whole heads of larvae, corresponding to approximately 50 individuals), L2 (approximately 90 heads dissected), L3 (approximately 80 heads dissected), and L4 and mature larvae (approximately 50 heads dissected). The antennae (80 pairs for each sex), heads (40 females), legs (40 females), male genitalia (40 males), and female ovipositors (40 females) were separately excised from 2–3-day-old adults, immediately frozen in liquid nitrogen, and stored at −70 °C until use.
2.2. Transcriptome Sequencing
Total RNA was extracted using TRIzol Reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s protocol. RNA concentration and purity were determined using a NanoDrop ND-2000 spectrophotometer (Thermo Fisher Scientific, Applied Biosystems, Waltham, MA, USA), and integrity was verified using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Approximately 10 µg of total RNA per sample was used for library construction.
RNA-seq libraries were prepared using the MGIEasy RNA Library Prep Kit (MGI Tech, Shenzhen, China) and sequenced on the Illumina HiSeq X Ten platform (paired-end 150 bp) by the Beijing Genomics Institute (BGI, Shenzhen, China). Raw reads were quality-checked using FastQC [18], and adaptor contamination, low-quality reads, and poly-N sequences were removed using SOAPnuke (v1.6.5) [18]. High-quality clean reads were used for all downstream analyses. The larval sequencing data were deposited in the NCBI Sequence Read Archive under accession number PRJNA1372393. Adult sequencing data were downloaded from the NCBI Sequence Read Archive under BioProject accession PRJNA1074269 [19].
2.3. Identification of Chemosensory Genes
Chemosensory gene sequences from 16 lepidopteran species, including Bombyx mori, Plutella xylostella, Chilo suppressalis, Ostrinia furnacalis, Helicoverpa armigera, Helicoverpa zea, Galleria mellonella, Eogystia hippophaecolus, Spodoptera exigua, Carposina sasakii, Mythimna separata, Manduca sexta, Loxostege sticticalis, Danaus plexippus, Heliconius melpomene, and Spodoptera litura, were retrieved from public databases (NCBI) to construct a comparative reference dataset (Supplementary Table S1) [20].
Candidate genes in P. operculella and P. absoluta were identified using TBLASTN (v2.14.0) with an E-value threshold of 1 × 10⁻⁵ [21]. Gene models were refined using Genewise [22], and genomic locations were manually verified. RNA-seq reads were aligned to the reference genome using HISAT2 (v2.1.0) [23] and assembled with StringTie (v1.0.4) to reconstruct transcript models [24]. Using transcript evidence, exon and intron boundaries were refined to produce full-length coding sequences.
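As an illustration of the E-value filter described above, the following minimal Python sketch screens tabular TBLASTN output. It assumes the standard 12-column BLAST+ tabular layout (`-outfmt 6`) and that hits at or below the cutoff are retained; the function name, the dictionary keys, and the input text are hypothetical and are not taken from the authors' pipeline.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return hits with E-value <= cutoff, best bit score first.

    Assumed column order (BLAST+ -outfmt 6):
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        # Skip comment lines and rows that do not have all 12 columns.
        if len(row) < 12 or row[0].startswith("#"):
            continue
        evalue = float(row[10])
        if evalue <= cutoff:
            hits.append({
                "query": row[0],         # reference protein used as query
                "subject": row[1],       # genomic scaffold or chromosome
                "sstart": int(row[8]),   # hit start on the genome
                "send": int(row[9]),     # hit end on the genome
                "evalue": evalue,
                "bitscore": float(row[11]),
            })
    hits.sort(key=lambda h: h["bitscore"], reverse=True)
    return hits
```

The retained genomic intervals would then be the starting points for gene-model refinement and manual verification, as described in the text.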
Conserved domains were annotated with InterPro and InterProScan [25,26]; signal peptides were predicted with SignalP 5.0 [27]; and transmembrane helices and topology were predicted using DeepTMHMM [28].
2.4. Sequence Alignment and Phylogenetic Analysis
Open reading frames (ORFs) were predicted using TBtools (v2.034), and signal peptides for OBPs and CSPs were predicted with SignalP 5.0 [29]. Amino acid sequences were translated using Expasy and aligned with MAFFT (v7.164b) [2]. Phylogenetic trees were built in IQ-TREE (v2.0.7) using the best substitution model selected by ModelFinder [30,31], and nodal support was assessed with 1000 bootstrap replicates [32]. The resulting trees were visualized in iTOL (v5) [33]. OBPs were classified into Classic, Minus C, and Plus C subfamilies based on conserved cysteine motifs. CSPs showed the conserved four-cysteine core and N-terminal signal peptides, while SNMPs exhibited CD36 family features, including two transmembrane domains.
2.5. Differential Expression and Expression Profiling
Clean reads were mapped to the P. operculella reference genome using STAR (v2.7.10b) [34]. Gene expression levels were quantified as transcripts per million (TPM) with RSEM (v1.3.3) [35]. Expression profiles were visualized using the pheatmap (v1.0.12) package in R (v4.2.0) [36,37]. Differential expression analysis was performed using DESeq2 (v1.36.0), with significance thresholds set at |log2 fold change| > 1 and FDR < 0.05 [38,39]. Each tissue type was analyzed with three biological replicates.
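The significance thresholds above can be written as a simple decision rule. The sketch below is illustrative only: it assumes each gene carries a log2 fold change and an adjusted p-value (FDR) exported from the differential expression step, and the record type, the bias labels, and the handling of missing adjusted p-values are hypothetical choices rather than the authors' code.

```python
from typing import NamedTuple, Optional


class DEResult(NamedTuple):
    gene: str
    log2fc: float            # log2(numerator / denominator), e.g. log2(MAn/FAn)
    padj: Optional[float]    # adjusted p-value (FDR); None if not available


def classify_bias(result, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Apply |log2 fold change| > 1 and FDR < 0.05 to one gene."""
    # Genes without an adjusted p-value are treated as not significant
    # (an assumption made for this sketch).
    if result.padj is None or result.padj >= fdr_cutoff:
        return "not significant"
    if result.log2fc > lfc_cutoff:
        return "numerator-biased"
    if result.log2fc < -lfc_cutoff:
        return "denominator-biased"
    return "not significant"
```

For an antennal contrast expressed as log2(MAn/FAn), "numerator-biased" corresponds to male-biased and "denominator-biased" to female-biased expression.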
To enable visualization of expression profiles, TPM values were transformed as log10(TPM + 0.001) and then row-wise z-scored to equalize dynamic ranges across genes. Heatmaps were generated using the pheatmap (v1.0.12) package in R (v4.2.0), with the color scale fixed at −3 to 3. Columns consisted of larval samples at different developmental stages (L1; L2H/L2G; L3H/L3G; L4H/L4G/L4L; LL4H/LL4G/LL4B/LL4X/LL4Z) and adult tissues (female antenna, FAn; male antenna, MAn; female ovipositor, FOv; male genitalia, MGe; female foreleg, FFo; female head, FHe).
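The transformation described above can be sketched for a single gene (one heatmap row) as follows. This is a minimal illustration using only the Python standard library. The use of the sample standard deviation and the clipping of values to the −3 to 3 range are assumptions made here to mimic a fixed color scale, and the function name is hypothetical.

```python
import math
import statistics


def zscore_row(tpm_values, pseudocount=0.001, clip=3.0):
    """log10(TPM + pseudocount), then z-score across samples for one gene."""
    logged = [math.log10(v + pseudocount) for v in tpm_values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample SD (n - 1 denominator), an assumption
    if sd == 0:
        # A gene with identical values in every sample carries no contrast.
        return [0.0] * len(logged)
    # Clip so that extreme values saturate at the ends of the color scale.
    return [max(-clip, min(clip, (x - mean) / sd)) for x in logged]
```

Because each row is centered and scaled independently, the resulting values show where a gene is relatively high or low across samples, not its absolute abundance.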
For targeted analysis, expression levels of OBP, CSP, and SNMP genes in the female ovipositor were extracted from the same RNA-seq dataset, enabling direct comparison with antennae and other tissues.
3. Results
3.1. Identification of Chemosensory-Related Genes
3.1.1. Odorant-Binding Proteins
A total of 47 OBPs were annotated in the genome of P. operculella. Of these, 25 were newly identified and named PopeOBP23 to PopeOBP47 (Supplementary Table S2), continuing the numbering of the OBPs previously identified in the laboratory [38]. Among the 47 PopeOBPs, 44 possess full-length nucleotide sequences and encode proteins ranging from 103 aa (PopeOBP25) to 707 aa (PopeOBP27); the remaining three (PopeOBP32, PopeOBP37, and PopeOBP45) lack full-length nucleotide sequences and encode proteins of 65 (PopeOBP45) to 178 aa (PopeOBP32) (Supplementary Table S2). PopeOBP01, PopeOBP03, PopeOBP20, PopeOBP25, PopeOBP29, PopeOBP33, PopeOBP34, PopeOBP37, PopeOBP41, PopeOBP43, PopeOBP44, and PopeOBP45 did not contain predicted signal peptides. Phylogenetic analysis revealed that the PopeOBPs were distributed among different evolutionary clades (Figure 1). A total of 37 PopeOBPs were classified into the Classic OBP subfamily, characterized by six conserved cysteine residues and the typical cysteine motif C1-X15–39-C2-X3-C3-X21–24-C4-X7–12-C5-X8-C6 (Figure 2a).
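The six-cysteine motif quoted above can be translated into a regular expression for a quick screen of predicted protein sequences. The sketch below is a hedged illustration, not the authors' classification procedure: it assumes that each X stands for any non-cysteine residue, and the pattern and function names are hypothetical.

```python
import re

# C1-X15-39-C2-X3-C3-X21-24-C4-X7-12-C5-X8-C6, with X taken as any
# residue other than cysteine (an assumption of this sketch).
CLASSIC_OBP_MOTIF = re.compile(
    r"C[^C]{15,39}C[^C]{3}C[^C]{21,24}C[^C]{7,12}C[^C]{8}C"
)


def has_classic_obp_motif(protein_seq):
    """Return True if the sequence contains the six-cysteine spacing pattern."""
    return CLASSIC_OBP_MOTIF.search(protein_seq.upper()) is not None
```

A match only indicates compatible cysteine spacing. Subfamily assignment in the study also relies on alignment and phylogenetic placement.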
Five genes (PopeOBP08, PopeOBP13, PopeOBP23, PopeOBP24, and PopeOBP44) belonged to the Minus-C OBP subfamily, with all of them having full-length sequences. Among them, PopeOBP08, PopeOBP13, PopeOBP23, and PopeOBP24 were predicted to possess signal peptide sequences. These proteins exhibited the conserved cysteine pattern C1-X30-C2-X38–44-C3-X18–19-C4 (Figure 2; Supplementary Table S2).
The remaining five candidate OBPs (PopeOBP27, PopeOBP29, PopeOBP33, PopeOBP38, and PopeOBP47) were identified as Plus-C OBPs. PopeOBP27, PopeOBP38, and PopeOBP47 were predicted to possess signal peptides and exhibited the characteristic C-P-X9-C motif at the C terminus. Signal-peptide-rich N-termini are evident in most full-length sequences, consistent with secretion to the sensillar lymph (Figure 2c; Supplementary Table S2).
A total of 39 OBPs were annotated from the genome of P. absoluta. All were newly identified and named PabsOBP01 to PabsOBP39; all possessed full-length nucleotide sequences and encoded predicted proteins of 90–669 aa in length. Of these, 27 contained a putative N-terminal signal peptide, whereas 12 (PabsOBP01, PabsOBP12, PabsOBP16, PabsOBP18, PabsOBP20, PabsOBP22, PabsOBP23, PabsOBP24, PabsOBP25, PabsOBP28, PabsOBP37, and PabsOBP38) lacked a predicted signal peptide (Supplementary Table S2). Phylogenetic analysis showed that the PabsOBPs were distributed across multiple evolutionary clades (Figure 1).
A total of 32 PabsOBPs were grouped into the Classic OBP subfamily, which is characterized by six conserved cysteine residues and the typical structural motif C1-X15–39-C2-X3-C3-X21–24-C4-X7–12-C5-X8-C6 (Figure 2a, Supplementary Table S2). Alignment of PabsOBPs confirmed the canonical six-cysteine motif across Classic members, whereas the Minus-C and Plus-C subfamilies showed the expected loss or gain of cysteines. Five genes, namely, PabsOBP01, PabsOBP02, PabsOBP03, PabsOBP05, and PabsOBP06, were assigned to the Minus-C OBP subfamily. All of them possessed full-length nucleotide sequences, and PabsOBP02, PabsOBP03, PabsOBP05, and PabsOBP06 contained predicted signal peptide sequences (Figure 2b, Supplementary Table S2).
3.1.2. Chemosensory Proteins
A total of 26 chemosensory proteins (PopeCSPs) were identified in the genome of P. operculella. All 26 were newly identified and named PopeCSP01 to PopeCSP26 according to their chromosomal positions. All PopeCSPs contained complete open reading frames encoding proteins ranging from 76 to 313 amino acids in length, and each protein harbored a signal peptide and the four highly conserved cysteine residues characteristic of secreted CSP family members (Figure 3, Supplementary Table S2).
A total of 24 CSP genes (PabsCSP01-PabsCSP24) were identified from the genome of P. absoluta. All PabsCSPs contained complete open reading frames encoding proteins ranging from 86 to 487 amino acids in length. Among them, 21 sequences possessed predicted N-terminal signal peptides (Supplementary Table S2). Sequence alignment revealed that all PabsCSPs exhibited the four highly conserved cysteine residues characteristic of the CSP family (Figure 3), with a typical motif pattern of C1–X6–8–C2–X18–19–C3–X2–C4. All PabsCSPs were full-length sequences, supporting their functional relevance (Figure 3, Supplementary Table S2).
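The four-cysteine pattern given above (C1–X6–8–C2–X18–19–C3–X2–C4) can also be checked by measuring the number of residues between consecutive cysteines, which makes the observed spacing easy to report. The following Python sketch is illustrative; the function names and the sliding-window search over consecutive cysteines are assumptions of this example, not the authors' method.

```python
# Allowed residue counts between consecutive cysteines, taken from the
# motif C1-X6-8-C2-X18-19-C3-X2-C4 stated in the text.
CSP_GAPS = [(6, 8), (18, 19), (2, 2)]


def cysteine_spacings(protein_seq):
    """Number of residues between each pair of consecutive cysteines."""
    positions = [i for i, aa in enumerate(protein_seq.upper()) if aa == "C"]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def has_csp_core(protein_seq, gaps=CSP_GAPS):
    """True if four consecutive cysteines show the expected spacing."""
    spacings = cysteine_spacings(protein_seq)
    width = len(gaps)
    for start in range(len(spacings) - width + 1):
        window = spacings[start:start + width]
        if all(lo <= s <= hi for s, (lo, hi) in zip(window, gaps)):
            return True
    return False
```

Reporting `cysteine_spacings` alongside the yes/no result helps flag sequences that deviate from the motif by a single residue, which may merit manual inspection.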
Phylogenetic analysis of PabsCSPs and PopeCSPs revealed distinct evolutionary lineages for each species. The PabsCSPs (from P. absoluta) and PopeCSPs (from P. operculella) were both classified into separate clades, suggesting species-specific evolutionary paths for their CSP gene families.
In the phylogenetic tree, PabsCSPs clustered with other CSPs from closely related species within the genus Phthorimaea, forming a well-supported group, while PopeCSPs clustered closely with PabsCSPs and were extensively intermingled on the tree, indicating overall conservation of CSP evolution between the two species (Figure 3).
3.1.3. Sensory Neuron Membrane Proteins
Two SNMP genes, PopeSNMP1 and PopeSNMP2, were identified from the genome of P. operculella and named based on phylogenetic analysis. Phylogenetic analysis (Figure 4) divided the SNMPs into two distinct clades, corresponding to SNMP1 and SNMP2. PopeSNMP1 clustered with P. absoluta PabsSNMP1 and other lepidopteran SNMP1s, forming a well-supported lineage typically associated with olfactory-receptor neurons. In contrast, PopeSNMP2 grouped with PabsSNMP2, SlitSNMP2, LstiSNMP2, and PxylSNMP2 within the SNMP2 clade, which is generally expressed in supporting cells and thought to play a role in hydrophobic ligand transport or clearance (Figure 4).
3.2. Tissue-Specific Expression Patterns Analysis
3.2.1. Global Transcriptome Structure Revealed by PCA
Principal component analysis (PCA) of transcriptomes across all sampled developmental stages and tissues revealed clear separation based on tissue type and developmental stage (Supplementary Figure S1). The first two principal components (PC1 and PC2) accounted for 17.15% and 12.26% of the total variance, respectively. The samples clustered primarily by tissue origin; for instance, larval heads (L2H, L3H, L4H, and LL4H) formed a distinct group, while adult chemosensory tissues such as antennae (FAn, MAn) and reproductive organs (FOv, MGe) clustered separately. This global expression divergence underscores the tissue-specific regulatory landscape within which the chemosensory genes are deployed (Supplementary Figure S1).
3.2.2. OBP Tissue-Specific Expression Patterns Analysis
PopeOBP genes exhibited pronounced developmental- and tissue-specific expression patterns. Overall, most highly expressed genes were concentrated in larval heads and adult antennae, while several members also showed specific or relatively high expression in the body wall/fat body and reproductive tissues.
During the larval stages, a stable set of olfaction-related PopeOBP genes, including PopeOBP15, PopeOBP36, PopeOBP01, PopeOBP43, PopeOBP25, and PopeOBP37, was consistently highly expressed in the heads of L1–L4 larvae and mature larvae (L1, L2H, L3H, L4H, and LL4H), with additional higher expression of PopeOBP04 and PopeOBP47/PopeOBP29 in certain instars. The persistent high expression of these genes across multiple larval-head samples suggests that they may participate in the detection of volatile semiochemicals during larval stages and contribute to host location and feeding-related behaviors. In contrast, only very few PopeOBP genes were highly expressed in larval gut tissues: no markedly high expression was detected in L2G and L3G, and only PopeOBP04 showed appreciable expression in L4G. In LL4G, the overall expression levels of PopeOBP genes were low, with only weak expression of PopeOBP29. These results indicate that, during larval development, the primary functional sites of PopeOBP genes are head-associated olfactory tissues rather than the digestive tract (Figure 5a, Supplementary Tables S3 and S4).
In other larval body tissues, several PopeOBP members showed a broader tissue distribution. For example, in the hemolymph of fourth-instar larvae (L4L), PopeOBP29, PopeOBP08, PopeOBP04, PopeOBP38, and PopeOBP42 were highly expressed; in the fat body/body wall in mature larvae (LL4B), PopeOBP02, PopeOBP47, PopeOBP38, and PopeOBP14 were clearly enriched; and in other mature larval body parts (LL4X and LL4Z), PopeOBP42, PopeOBP29, PopeOBP27, PopeOBP38, and PopeOBP14 were predominantly detected. The relatively high expression of these genes in non-canonical olfactory tissues may be associated with the formation of chemical barriers at the body surface or with the binding and transport of endogenous metabolites or non-volatile compounds, suggesting that the PopeOBP family in larvae may participate not only in odor perception but also in broader physiological processes (Figure 5a, Supplementary Tables S3 and S4).
In adults, PopeOBP genes were highly expressed in the antennae. In female antennae (FAn), numerous PopeOBP genes showed high expression, including PopeOBP29, PopeOBP02, PopeOBP13, PopeOBP35, PopeOBP26, PopeOBP28, PopeOBP09, PopeOBP32, PopeOBP22, PopeOBP30, PopeOBP05, PopeOBP07, PopeOBP10, PopeOBP46, PopeOBP06, PopeOBP21, PopeOBP35, PopeOBP39, PopeOBP34, and PopeOBP41, among which PopeOBP13 displayed the highest expression level, suggesting a key role in the detection of host-related odors or oviposition-associated cues in females. Male antennae (MAn) also expressed a large number of PopeOBP genes. PopeOBP32, PopeOBP09, PopeOBP35, PopeOBP26, PopeOBP28, PopeOBP22, PopeOBP30, PopeOBP05, PopeOBP07, PopeOBP10, PopeOBP46, PopeOBP06, PopeOBP21, PopeOBP45, PopeOBP39, PopeOBP34, and PopeOBP41 were shared with female antennae, whereas PopeOBP20, PopeOBP11, PopeOBP18, PopeOBP12, PopeOBP19, and PopeOBP47 were highly expressed only in males. Notably, PopeOBP45 exhibited the highest expression level in male antennae, and this male-biased high expression suggests that it may be intricately involved in the perception of sex pheromones or other mating-related chemical signals (Figure 5a, Supplementary Tables S3 and S4).
In reproductive and non-olfactory adult tissues, a subset of PopeOBP genes also showed prominent expression. In ovipositors (FOv), PopeOBP02, PopeOBP24, and PopeOBP44 were highly expressed; in male genitalia (MGe), PopeOBP29, PopeOBP18, PopeOBP24, PopeOBP05, PopeOBP07, and PopeOBP44 were enriched; and in female forelegs (FFo), PopeOBP18, PopeOBP24, PopeOBP44, and PopeOBP08 were predominantly expressed. In addition, in female heads without antennae (FHe), relatively high expression of PopeOBP29, PopeOBP02, PopeOBP47, PopeOBP30, PopeOBP05, PopeOBP07, and PopeOBP27 was detected. Taken together, PopeOBP18, PopeOBP24, PopeOBP44, and PopeOBP29 repeatedly appeared as highly expressed genes in antennae, reproductive organs, and leg tissues, indicating that they may act in a coordinated manner in mate recognition, mating behavior, and oviposition-related behaviors. Overall, the differential expression of PopeOBP genes between larval and adult stages, as well as between olfactory and non-olfactory tissues, reflects their diverse functional roles across life stages and in multiple behavioral and physiological processes (Figure 5a, Supplementary Tables S3 and S4).
Based on DESeq2 analysis of transcriptomic data from female antennae (FAn) and male antennae (MAn), differentially expressed genes were identified using |log2(fold change)| > 1 and FDR < 0.05 as the filtering thresholds, and only members of the PopeOBP gene family were considered (Supplementary Table S3). In total, 24 PopeOBP genes were found to be significantly differentially expressed between female and male antennae, of which 10 were male-biased (PopeOBP01, PopeOBP05, PopeOBP06, PopeOBP07, PopeOBP12, PopeOBP14, PopeOBP22, PopeOBP30, PopeOBP34, and PopeOBP45) and 14 were female-biased (PopeOBP08, PopeOBP09, PopeOBP11, PopeOBP16, PopeOBP17, PopeOBP21, PopeOBP23, PopeOBP26, PopeOBP28, PopeOBP29, PopeOBP35, PopeOBP39, PopeOBP41, and PopeOBP46). Among the male-biased genes, the log2(MAn/FAn) values of PopeOBP05, PopeOBP07, PopeOBP12, PopeOBP14, PopeOBP34, and PopeOBP45 were all greater than 2, corresponding to an approximately 4–50-fold upregulation in male antennae, whereas among the female-biased genes, the log2(MAn/FAn) values of PopeOBP08, PopeOBP09, PopeOBP11, PopeOBP16, PopeOBP17, PopeOBP21, PopeOBP23, PopeOBP26, PopeOBP28, PopeOBP29, PopeOBP35, PopeOBP39, PopeOBP41, and PopeOBP46 were less than −1, with markedly higher expression in female antennae than in male antennae (Figure 5a, Supplementary Table S3).
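For readers converting between the log2 scale and linear fold change, the relationship used in the paragraph above can be written out as a short worked calculation. The numerical endpoints are rounded, and the upper value is derived here from the stated 50-fold figure, not from the supplementary data.

```latex
\[
\text{fold change} = 2^{\log_2(\mathrm{MAn}/\mathrm{FAn})},
\qquad 2^{2} = 4,
\qquad 2^{5.64} \approx 50
\]
\[
\log_2(\mathrm{MAn}/\mathrm{FAn}) < -1
\iff \frac{\mathrm{FAn}}{\mathrm{MAn}} > 2
\]
```

A log2 value above 2 therefore implies more than a fourfold excess in male antennae, and a value below −1 implies more than a twofold excess in female antennae.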
Based on DESeq2 analysis of transcriptomic data from ovipositor (FOv) and male genitalia (MGe), differentially expressed genes were identified using |log2(fold change)| > 1 and FDR < 0.05 as the filtering thresholds, with log2(MGe/FOv) representing the fold change. A total of nine PopeOBP genes were significantly differentially expressed between FOv and MGe, including six male-biased genes (PopeOBP23, PopeOBP40, PopeOBP18, PopeOBP05, PopeOBP11, and PopeOBP08) and three female-biased genes (PopeOBP46, PopeOBP09, and PopeOBP29), among which PopeOBP09 was detected at high levels in FOv, indicating ovipositor-specific high expression (Figure 5a, Supplementary Table S3).
3.2.3. CSP Expression Profiles Across Tissues and Developmental Stages
Based on TPM values across all samples, the expression levels of 26 PopeCSP genes were visualized as a clustered heatmap (Figure 5b). The heatmap revealed clear spatiotemporal differences and allowed these genes to be grouped into three major expression patterns.
The first group showed larval-biased expression and comprised PopeCSP12, PopeCSP23, PopeCSP25, PopeCSP03, PopeCSP07, PopeCSP06, PopeCSP05, and PopeCSP16, which generally displayed higher normalized expression values in larval tissues (L1–L4 and LL4 samples) and lower values in adult tissues. The second group showed adult-biased expression, mainly in adult antennae and reproductive tissues, and included PopeCSP11, PopeCSP01, PopeCSP18, PopeCSP04, PopeCSP24, PopeCSP26, PopeCSP17, PopeCSP22, PopeCSP02, PopeCSP20, PopeCSP13, PopeCSP19, PopeCSP10, PopeCSP21, PopeCSP14, and PopeCSP15, which exhibited relatively low expression in most larval samples but elevated expression in FAn, MAn, FOv, and/or MGe. The third group consisted of PopeCSP08 and PopeCSP09, which showed intermediate expression patterns with moderate normalized expression levels across multiple larval and adult tissues. Overall, PopeCSP12, PopeCSP23, PopeCSP25, PopeCSP03, PopeCSP07, PopeCSP06, PopeCSP05, and PopeCSP16 were predominantly expressed in larval samples, whereas PopeCSP11, PopeCSP01, PopeCSP18, PopeCSP04, PopeCSP24, PopeCSP26, PopeCSP17, PopeCSP22, PopeCSP02, PopeCSP20, PopeCSP13, PopeCSP19, PopeCSP10, PopeCSP21, PopeCSP14, and PopeCSP15 showed higher expression in adult antennae and reproductive tissues, with PopeCSP08 and PopeCSP09 displaying broader, intermediate expression across developmental stages and tissues (Figure 5b; Supplementary Table S5).
Based on DESeq2 analysis of transcriptomic data from female antennae (FAn) and male antennae (MAn), differentially expressed genes were identified using |log2(fold change)| > 1 and FDR < 0.05 as the filtering thresholds, and only members of the PopeCSP gene family were considered. A total of four PopeCSP genes were found to be significantly differentially expressed between FAn and MAn, including one male-biased gene (PopeCSP14) and three female-biased genes (PopeCSP13, PopeCSP17, and PopeCSP21) (Figure 5b; Supplementary Table S3).
Based on DESeq2 analysis of transcriptomic data from ovipositors (FOv) and male genitalia (MGe), differentially expressed genes were likewise identified using |log2(fold change)| > 1 and FDR < 0.05 for members of the PopeCSP gene family. In total, six PopeCSP genes were significantly differentially expressed between FOv and MGe, including five male-biased genes (PopeCSP04, PopeCSP10, PopeCSP13, PopeCSP18, and PopeCSP26) and one female-biased gene (PopeCSP15) (Figure 5b; Supplementary Table S3).
3.2.4. SNMP Expression Across Tissues and Developmental Stages
PopeSNMP1 and PopeSNMP2 were expressed at low levels in larval gut tissues, showed moderate expression in certain larval head and body samples, and reached their highest levels in adult antennae, with additional moderate expression in reproductive and leg tissues. Based on DESeq2 analysis of transcriptomic data from female antennae (FAn) and male antennae (MAn), differentially expressed genes were identified using |log2(fold change)| > 1 and FDR < 0.05 as the filtering thresholds, and only members of the PopeSNMP gene family were considered. Only PopeSNMP2 met the differential expression criteria, and it was male-biased (Figure 5c; Supplementary Table S3).
The candidates identified were categorized into two main functional modules. The first module includes female-biased soluble carriers, enriched in the ovipositor and female antennae, and involves genes such as PopeOBP09, PopeOBP46, and PopeCSP15, which are likely involved in host-location and oviposition-site selection. The second module comprises male-biased pheromone-associated genes, including PopeOBP45, PopeOBP05, and PopeSNMP2, which are predicted to play key roles in pheromonal signaling during mating (Supplementary Table S6).
In addition, several genes exhibit complex or inconsistent expression patterns. For instance, PopeOBP23 and PopeCSP13 show conflicting expression across tissues, suggesting potential multifunctional roles or regulatory redeployment. Other genes like PopeOBP29, PopeOBP24, and PopeOBP44 are expressed throughout different tissues, but their expression patterns are less consistent with their phylogeny, pointing to potential functional diversity or involvement in broader biological processes beyond just olfaction (Supplementary Table S6).
4. Discussion
Our genome-guided survey reveals a structured chemosensory landscape in Phthorimaea operculella and Phthorimaea absoluta, with 47 OBPs, 26 CSPs, and two SNMPs in P. operculella and 39 OBPs, 24 CSPs, and two SNMPs in P. absoluta. In P. operculella antennae, this repertoire is deployed in a sex-biased manner: DESeq2 analysis identified 24 OBPs, four CSPs, and one SNMP with significant sex-biased expression, of which 14 OBPs and three CSPs are enriched in female antennae, whereas 10 OBPs and one CSP, together with SNMP2, are enriched in male antennae; SNMP1 shows no detectable sex-specific bias. In reproductive tissues, chemosensory genes are further stratified, with three OBPs and one CSP enriched in ovipositors (FOv) and six OBPs and five CSPs enriched in male genitalia (MGe), while no SNMPs meet the differential-expression threshold between these tissues.
These totals align with representative Lepidoptera species, in which OBP repertoires typically span several dozen genes, CSPs number from 1 to 30, and SNMPs are stably represented by the 1/2 pair [1,39]. Chromosomal mapping in Pope revealed local tandem clusters (notably within the CSP set), congruent with lineage-specific micro-expansions via proximal duplication; similar clustered architectures occur in Cydia pomonella and are a hallmark of birth-and-death evolution in these families [40,41]. Together with the phylogenetic signal, these features support a model whereby the capacity of the soluble-carrier layer scales with the chemical complexity of host and pheromonal niches, whereas the SNMP axis remains under a stronger purifying constraint as a conserved membrane co-factor module [5,42,43,44].
The phylogeny of OBPs from P. operculella, P. absoluta, and several other lepidopteran species resolves four major clades corresponding to Classic, Minus-C, Plus-C, and PBP/GOBP OBPs (with bootstrap support > 70 for the main nodes). Rather than forming a single lineage, PopeOBPs and PabsOBPs are distributed across all three structural OBP subfamilies, indicating that both species possess a full complement of Classic, Minus-C, and Plus-C OBPs. Within the PBP/GOBP clade, several Pope and Pabs OBPs cluster together with canonical GOBP1/2 and PBP sequences from Bombyx mori and other moths, highlighting putative pheromone- and host-odor-binding candidates in Phthorimaea.
The OBP phylogeny recovers the canonical GOBP1/2 and PBP clades and the Classic/Minus-C/Plus-C structural subfamilies, providing orthology anchors for functional inference [2,45]. GOBP-like genes are typically abundant in the antennae of both sexes and often associated with broad-spectrum plant volatiles/background odors [46,47,48]. The FOv enrichment of a subset of GOBP-like members in our data suggests repurposing for near-field oviposition contexts, consistent with genetic functional evidence that GOBP2 can modulate oviposition preference [49]. PBP-like genes (e.g., PBP-A/B/C/D sublineages) are frequently male-biased in antennae and implicated in pheromone capture and release [45,50,51,52], yet detectable expression in reproductive tissues for a subset indicates that PBP functions are not strictly confined to male antennae [52,53].
Expression analyses across developmental stages and adult tissues resolved a layered and reconfigurable peripheral architecture. Within the soluble carrier tier, OBPs and CSPs were robustly expressed in antennae and reproductive tissues, consistent with roles that extend beyond canonical olfaction [1,54,55]. Female antennae frequently showed sex-biased elevations in carrier abundance, and the female ovipositor contained a focused module enriched in FOv, consistent with near-field appraisals of oviposition cues and with local buffering and transfer of hydrophobic ligands [54,56,57,58]. Plasticity of carrier deployment was observed in larval heads and feeding-related tissues, indicating coupling with feeding state and host use [35]. By contrast, the membrane cofactor tier appeared highly conserved: SNMP1 and SNMP2 were present in P. operculella and P. absoluta, with SNMP1 tending to be male-biased in antennae and aligned with rapid transfer and clearance of pheromonal components; tissue bias in reproductive organs was modest [5,43,50]. Together, these observations support a scheme in which female tissues are prioritized for high-throughput soluble carrier supply, whereas male antennae are characterized by membrane-side facilitation, jointly tuning long-range chemosensation in antennae and near-field chemosensation in the ovipositor and genitalia.
Differential expression further sharpened these inferences. In female vs. male antennae (FAn vs. MAn), 24 OBPs show significant sex bias (|log2FC| ≥ 1, FDR < 0.05): 14 are enriched in FAn and 10 in MAn. Four CSPs are also sexually dimorphic, with three being FAn-biased and one being MAn-biased; PopeSNMP2 is likewise upregulated in MAn, whereas PopeSNMP1 shows no significant sex bias. In ovipositors vs. male genitalia (FOv vs. MGe), nine OBPs meet the same criteria, including three FOv-biased genes and six MGe-biased genes. Six CSPs are differentially expressed between FOv and MGe, with one being FOv-biased and five being MGe-biased, while no SNMPs are significantly differentially expressed between these reproductive tissues. This division of labor aligns with broad patterns in Lepidoptera: OBPs/CSPs often show sex-biased antennal expression and deploy across sensory and reproductive tissues [59,60,61], whereas a compact FOv-enriched module of soluble carriers is present in the female ovipositor, consistent with near-field appraisals of oviposition cues and with local buffering and transfer of hydrophobic ligands [42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64]. Soluble-protein deployment remains plastic in larval heads and feeding-related tissues, supporting coupling with feeding state and host use [65,66]. By contrast, the membrane co-factor tier is highly conserved: SNMP1/2 occur in both species. SNMP1 is frequently male-biased in antennae and functionally associated with rapid transfer and clearance of pheromonal components, whereas tissue bias in reproductive organs is modest [5,43,44,55,56].
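The counts quoted for the two contrasts can be cross-checked with a short tally. The sketch below uses only numbers stated in the text; the dictionary layout and the helper name `sum_directions` are illustrative choices, not part of the study.

```python
from typing import Dict, Tuple

# Sex-biased gene counts as stated in the text: (female-side, male-side).
biased_counts: Dict[str, Dict[str, Tuple[int, int]]] = {
    "FAn_vs_MAn": {"OBP": (14, 10), "CSP": (3, 1), "SNMP": (0, 1)},
    "FOv_vs_MGe": {"OBP": (3, 6), "CSP": (1, 5), "SNMP": (0, 0)},
}

# Totals of differentially expressed genes per family, also as stated in the text.
reported_totals: Dict[str, Dict[str, int]] = {
    "FAn_vs_MAn": {"OBP": 24, "CSP": 4, "SNMP": 1},
    "FOv_vs_MGe": {"OBP": 9, "CSP": 6, "SNMP": 0},
}


def sum_directions(
    counts: Dict[str, Dict[str, Tuple[int, int]]]
) -> Dict[str, Dict[str, int]]:
    """Add female-side and male-side counts for every contrast and family."""
    return {
        contrast: {family: female + male for family, (female, male) in families.items()}
        for contrast, families in counts.items()
    }


computed_totals = sum_directions(biased_counts)

# Report any family whose directional counts do not add up to the stated total.
mismatches = [
    (contrast, family)
    for contrast, families in reported_totals.items()
    for family, total in families.items()
    if computed_totals[contrast][family] != total
]
print(mismatches)
```

An empty `mismatches` list means the female-side and male-side counts sum to the reported totals in both contrasts.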
By integrating expression data with phylogenetic analysis, GOBP-like carriers enriched in the ovipositor, such as PopeOBP09, together with the ovipositor-abundant OBPs PopeOBP2, PopeOBP23, and PopeOBP33, are nominated as priority candidates for binding host volatiles and oviposition-medium cues. Within the PBP/GOBP clade, male-biased PBP-like OBPs (PopeOBP05, PopeOBP06, PopeOBP07, PopeOBP22, and PopeOBP34) that are co-expressed with male-antenna-biased PopeSNMP2 are highlighted as potential carriers for pheromone-related hydrophobic compounds, whereas ovipositor-biased CSPs (PopeCSP13, PopeCSP17, PopeCSP21, and PopeCSP15) and male-genitalia-biased CSPs (PopeCSP04, PopeCSP10, PopeCSP18, and PopeCSP26) emerge as additional chemosensory carriers of particular interest. Co-upregulated paralogs within tandem clusters likely exhibit complementary ligand ranges or release kinetics (a birth-and-death outcome of proximal duplication) [41,42,67], while Classic vs. Minus-C/Plus-C subfamily differences map onto cavity architecture and pH-dependent ligand release, suggesting the existence of affinity–release trade-offs testable in vitro [61,68,69].
Comparative analyses of expansions and losses in chemosensory gene families provide a “birth-and-death” evolutionary framework for interpreting niche differentiation and host-use divergence between P. operculella and P. absoluta while also enabling a translation to behavior-based, environmentally compatible control. Our inventory indicates that Pope encodes 47 odorant-binding proteins (OBPs) and 26 chemosensory proteins (CSPs), whereas Pabs encodes 39 OBPs and 24 CSPs, suggesting a larger soluble-carrier repertoire in Pope and potentially a broader molecular basis for capturing and buffering host volatiles and oviposition-substrate cues. Phylogenetic relationships partition these carriers into a conserved core shared by both species and lineage-biased expansion or duplication modules. Many OBP lineages include members from both Pope and Pabs, consistent with a substantial shared core mediating conserved chemosensory recognition and transport. In contrast, several lineages show pronounced within-species multicopy clustering, asymmetric copy-number patterns consistent with post-divergence duplications, or lineage-specific losses or incomplete annotation. Notable patterns include a Pope-biased clade lacking clear Pabs counterparts, where PopeOBP31, PopeOBP32, PopeOBP36, and PopeOBP40 form a closely related group; a strongly asymmetric lineage in which PopeOBP11, PopeOBP12, PopeOBP16, PopeOBP17, and PopeOBP18 cluster as multiple Pope paralogs, whereas only PabsOBP21 appears on the Pabs side; additional within-Pope near-paralog expansions involving PopeOBP29, PopeOBP33, PopeOBP38, and PopeOBP47; and a duplicated lineage retained in both species containing PopeOBP22 and PopeOBP45 together with PabsOBP4 and PabsOBP33. 
Because duplication can generate paralogous diversity that facilitates amino-acid divergence in ligand-binding pockets, shifts in ligand spectra, and altered binding and release kinetics, these lineage-biased or asymmetrically expanded carriers are more likely than strict one-to-one orthologs to contribute to species-specific host preference and oviposition-site choice. Accordingly, priority candidates for explaining host-use divergence include PopeOBP31, PopeOBP32, PopeOBP36, PopeOBP40, PopeOBP11, PopeOBP12, PopeOBP16, PopeOBP17, PopeOBP18, PopeOBP29, PopeOBP33, PopeOBP38, and PopeOBP47, with synteny and microsynteny analyses needed to distinguish true gains and losses from annotation differences.
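The lineage patterns described above can be summarized by comparing per-species copy numbers within each lineage. The following sketch is illustrative only: lineage membership is copied from the text, whereas the lineage labels, the function `copy_number_pattern`, and its simple counting rule are assumptions made for this example. The near-paralog group containing PopeOBP29, PopeOBP33, PopeOBP38, and PopeOBP47 is left out because the text does not state its P. absoluta membership.

```python
from typing import Dict, List

# Lineage membership as listed in the text; the labels are informal.
lineages: Dict[str, Dict[str, List[str]]] = {
    "Pope-biased clade": {
        "Pope": ["PopeOBP31", "PopeOBP32", "PopeOBP36", "PopeOBP40"],
        "Pabs": [],
    },
    "asymmetric lineage": {
        "Pope": ["PopeOBP11", "PopeOBP12", "PopeOBP16", "PopeOBP17", "PopeOBP18"],
        "Pabs": ["PabsOBP21"],
    },
    "retained duplicate lineage": {
        "Pope": ["PopeOBP22", "PopeOBP45"],
        "Pabs": ["PabsOBP4", "PabsOBP33"],
    },
}


def copy_number_pattern(pope: List[str], pabs: List[str]) -> str:
    """Classify a lineage by its copy numbers in the two species (assumed rule)."""
    if not pope and not pabs:
        return "empty"
    if not pabs:
        return "Pope-only"
    if not pope:
        return "Pabs-only"
    if len(pope) == len(pabs):
        return "balanced"
    return "asymmetric"


patterns = {
    name: copy_number_pattern(members["Pope"], members["Pabs"])
    for name, members in lineages.items()
}

# Lineages with unequal copy numbers are the ones the text prioritizes.
priority_candidates = sorted(
    gene
    for name, members in lineages.items()
    if patterns[name] in ("Pope-only", "asymmetric")
    for gene in members["Pope"]
)
```

Counting alone cannot separate true gains and losses from annotation differences, so a classification like this is only a starting point for the synteny and microsynteny checks the text calls for.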
Building on this evolutionary prioritization, our tissue- and sex-resolved candidate set provides a practical roadmap for behavior-based control in solanaceous systems. First, ovipositor-enriched and female-antenna-enriched carriers can be advanced through a reverse-chemical-ecology pipeline, including recombinant expression and ligand screening against solanaceous host volatiles and oviposition-medium extracts, to identify high-affinity cues that can be formulated as attractants for monitoring and mass trapping or as oviposition lures to concentrate egg laying. Second, the male-antenna-biased, pheromone-associated module, comprising male antenna-biased OBPs together with PopeSNMP2, offers targets with which to refine pheromone-based mating disruption by identifying binding and transfer partners of pheromone components and screening for competitive inhibitors or antagonists that reduce signal delivery or clearance efficiency. Third, the same prioritized genes provide tractable molecular targets for functional suppression: RNA interference against key carriers or SNMP2 and, when feasible, CRISPR-based perturbation can be used to test whether impairing ligand transport reduces host finding, mating, or oviposition in behavioral assays, thereby informing gene-guided development of semiochemical-based push–pull strategies.
In the context of chemosensory gene expression patterns, some genes exhibit a “largely consistent” pattern, meaning their expression is strongly aligned with their phylogenetic predictions and functional roles. For example, genes such as PopeOBP09 and PopeOBP46, which are expressed predominantly in the ovipositor and in female antennae, align with the expectation that these genes are involved in host-location and oviposition-site selection. In contrast, less consistent expression patterns refer to genes whose expression does not fully align with their predicted phylogenetic or functional roles. For instance, PopeOBP23 and PopeCSP13 show conflicting expression across tissues, suggesting that they may have multifunctional roles or that their regulatory mechanisms might differ depending on the biological context. These discrepancies between expected and observed expression patterns indicate that, in some cases, genes may have evolved additional or altered functions, possibly in response to environmental or ecological pressures.
Together, these candidates convert comparative genomics and expression resources into actionable targets for chemoreception-based pest management, linking copy-number changes to molecular divergence and ultimately behavioral differentiation.
5. Conclusions
By integrating genome-guided annotation with stage- and tissue-resolved transcriptomics, we present a curated catalogue of chemosensory genes in the potato tuber moth, P. operculella, comprising 47 OBPs, 26 CSPs, and two SNMPs, along with a comparative catalogue for P. absoluta (39 OBPs, 24 CSPs, and two SNMPs). OBPs were classified into Classic, Minus-C, and Plus-C based on conserved cysteine motifs; CSPs were found to have N-terminal signal peptides and the four conserved cysteine residues; and SNMPs displayed CD36 family features, including a signal peptide, two transmembrane domains, and a large extracellular loop. Maximum-likelihood phylogenies and chromosomal organization were used to support family assignments, reveal tandem expansions, and establish orthology across species. Expression analysis across developmental stages and adult tissues revealed a layered peripheral chemosensory framework. Antennal OBPs support long-range detection, while a subset of OBPs and CSPs is enriched in the ovipositor in females, suggesting a role in near-field chemical sensing. PopeSNMP2 was more highly expressed in male antennae, whereas PopeSNMP1 showed no significant sex bias. These resources provide candidates for biochemical assays, electrophysiological characterization, and genetic perturbations, aiding in the identification of ligand spectra and signaling functions. In practical terms, prioritizing these chemosensory proteins as targets for attractants, repellents, and RNA interference strategies may enhance integrated pest management, particularly for solanaceous crops.
References
- 1. Pelosi P., Zhu J., Knoll W. Beyond chemoreception: Diverse tasks of soluble olfactory proteins. Biol. Rev. 2018, 93, 184–200. doi:10.1111/brv.12339. PMID: 28480618.
- 2. Rihani K., Ferveur J.F., Briand L. The 40-year mystery of insect odorant-binding proteins. Biomolecules 2021, 11, 509. doi:10.3390/biom11040509. PMID: 33808208. PMCID: PMC8067015.
- 3. Jin X., Ha T.S., Smith D.P. SNMP is a signaling component required for pheromone sensitivity in Drosophila. Proc. Natl. Acad. Sci. USA 2008, 105, 10996–11001. doi:10.1073/pnas.0803309105. PMID: 18653762. PMCID: PMC2504837.
- 4. Forstner M., Gohl T., Gondesen I., Raming K., Breer H., Krieger J. Differential expression of SNMP-1 and SNMP-2 proteins in pheromone-sensitive hairs of moths. Chem. Senses 2008, 33, 291–299. doi:10.1093/chemse/bjm087. PMID: 18209018.
- 5. Gomez-Diaz C., Bargeton B., Abuin L., Bukar N., Reina J.H., Bartoi T., Graf M., Ong H., Ulbrich M.H., Masson J.F. A CD36 ectodomain mediates insect pheromone detection via a tunnelling mechanism. Nat. Commun. 2016, 7, 11866. doi:10.1038/ncomms11866. PMID: 27302750. PMCID: PMC4912623.
- 6. Hekmat-Scafe D.S., Scafe C.R., McKinney A.J., Tanouye M.A. Genome-wide analysis of the odorant-binding protein gene family in Drosophila melanogaster. Genome Res. 2002, 12, 1357–1369. doi:10.1101/gr.239402. PMID: 12213773. PMCID: PMC186648.
- 7. Xu P.X., Zwiebel L.J., Smith D.P. Identification of a distinct family of genes encoding atypical odorant-binding proteins in Anopheles gambiae. Insect Mol. Biol. 2003, 12, 549–560. doi:10.1046/j.1365-2583.2003.00440.x. PMID: 14986916.
- 8. Zhou J.J., Huang W., Zhang G.A., Pickett J.A., Field L.M. “Plus-C” odorant-binding protein genes in two Drosophila species and Anopheles gambiae. Gene 2004, 327, 117–129. doi:10.1016/j.gene.2003.11.007. PMID: 14960367.
