Transcriptome Analysis and Identification of Chemosensory Genes in Leguminivora glycinivorella
Jiaqi Shi, Yuxin Zhou, Zhengxiao Du, Ruirui Li, Qi Wang, Yu Gao, Shusen Shi

TL;DR
Researchers identified genes in the soybean pod borer that help it detect chemical signals, which could lead to better pest control methods.
Contribution
The study provides a comprehensive identification and expression analysis of chemosensory genes in Leguminivora glycinivorella.
Findings
183 candidate chemosensory genes were identified, including odorant, gustatory, and ionotropic receptors.
Sex-specific gene expression patterns suggest roles in pheromone detection and host plant interaction.
Absence of certain sensory gene groups may reflect adaptation to soybean feeding.
Abstract
The soybean pod borer is a serious soybean pest because its larvae feed inside pods and damage developing seeds, making conventional insecticide control difficult. To support the development of safer and more effective management strategies, we investigated the genes that may help this insect detect chemical cues from host plants, mates, and its surroundings. By analyzing transcriptome data from multiple adult tissues, we identified 183 candidate genes related to chemical sensing. Many of these genes showed clear differences in expression among tissues and between females and males. In particular, genes associated with odor detection were mainly enriched in the antennae, the primary sensory organs for smelling. Male antennae showed high expression of several candidate genes that may be involved in pheromone detection, whereas some genes were more highly expressed in females and may…
Funding
- National Key Research and Development Program of China
- Earmarked Fund for China Agriculture Research System of MOF and MARA
- Project of State Key Laboratory of Green Pesticide, Guizhou Medical University
Taxonomy
Topics: Neurobiology and Insect Physiology Research · Insect-Plant Interactions and Control · Olfactory and Sensory Function Studies
1. Introduction
The soybean pod borer, L. glycinivorella (Matsumura, 1898), a member of the family Olethreutidae (Lepidoptera), is a univoltine, highly host-specific pest that overwinters as mature larvae in the soil [1]. It is widely distributed across China, Japan, and other parts of East Asia, and ranks as one of the most destructive pests of soybean in northeastern and northern China [1]. Larvae bore into pods and feed on developing seeds, causing significant yield losses (typically 10–20%, up to >40% in severe outbreaks) and reducing grain quality and market value; infestation levels have been reported to increase in some regions/years [2]. Because larvae feed inside sealed pods, conventional chemical insecticides cannot effectively reach them, posing major challenges for control [3]. Consequently, research has increasingly shifted toward chemical signaling, and deciphering the molecular mechanisms underlying insect chemoperception has become an important avenue for developing novel green pest management strategies. Recent work has advanced our understanding of L. glycinivorella’s diapause regulation [4], transgenic resistance [5], microbial control [6], and biological control [7,8] and even identified new associated pests like Orosius orientalis (Matsumura, 1914) [9]. However, despite these efforts, the molecular basis of its olfactory system, which is central to host finding, mate location, and oviposition, remains uncharacterized. To date, no comprehensive study has identified or profiled its chemosensory receptor genes, leaving a critical gap in the development of behavior-based management tools.
In recent years, research on insect chemosensory systems has emerged as a focal point at the intersection of chemical ecology, behavioral biology, and sustainable pest control [10,11]. Insects rely on highly specialized chemosensory mechanisms to detect complex environmental cues—such as plant volatile organic compounds (VOCs), sex pheromones, oviposition stimulants, and defensive secondary metabolites [10]. This process is mediated by several conserved gene families: odorant receptors (ORs), gustatory receptors (GRs), ionotropic receptors (IRs), odorant-binding proteins (OBPs), chemosensory proteins (CSPs), and sensory neuron membrane proteins (SNMPs) [10,11,12,13,14].
Specifically, ORs are seven-transmembrane proteins that function as ligand-gated cation channels through heteromeric complex formation with the conserved co-receptor Orco. Binding of ligands such as plant terpenes or pheromone components triggers channel opening and the rapid activation of olfactory neurons [15,16]. GRs share a similar seven-transmembrane topology but are evolutionarily distinct from ORs; some evidence suggests that ORs may have originated from GRs. GRs primarily detect non-volatile compounds (e.g., sugars and bitter substances) via contact chemosensation and also mediate CO₂ perception; they are abundantly expressed in mouthparts and are crucial for host recognition and feeding decisions [17]. In Lepidoptera, IRs are classified into three subfamilies: antennal IRs (A-IRs), lepidopteran-specific IRs (LS-IRs), and divergent IRs (D-IRs) [18]. A-IRs, expressed in antennae, detect specific odorants such as acids and amines [19,20,21,22,23] and also contribute to thermosensation and hygrosensation [24,25,26,27]. D-IRs, found in gustatory tissues but not antennae, are implicated in taste reception [28,29,30,31,32], while LS-IR functions remain unclear. Functional IR complexes often require the co-receptors IR8a or IR25a. For example, in Bactrocera dorsalis (Hendel, 1912), CRISPR/Cas9 knockout of IR8a impairs acetic acid detection, and calcium imaging shows that IR64a and IR75a (paired with IR76b or IR8a) act as candidate acid receptors, potentially complementing detection channels mediated by ORs/GRs [31,32].
Beyond membrane receptors, soluble carrier proteins (OBPs and CSPs) play essential roles in transporting hydrophobic odorants through the aqueous sensillum lymph to receptors. OBPs typically contain six conserved cysteines and exhibit selectivity for odor molecules, accumulating in antennal lymph [33]. Although the functional necessity of OBPs has been questioned in some insect models, they are widely regarded as critically important for olfactory perception in Lepidoptera (see "Odorant Reception in Insects: Functional and Evolutionary Perspectives"). CSPs, with only four conserved cysteines, possess more compact, thermostable structures and broader expression, not only in olfactory organs but also in wings, legs, and hemolymph, suggesting roles beyond olfaction, such as in development or immunity [34]. SNMPs, CD36-family transmembrane glycoproteins localized on olfactory neuron membranes, enhance sensitivity to sex pheromones by interacting with pheromone receptors; loss of SNMP1 can severely disrupt mating behavior [35,36].
Notably, these chemosensory genes display pronounced tissue specificity and sexual dimorphism. For instance, in Helicoverpa armigera (Hübner, 1808), HarmSNMP1 is highly expressed in antennae and is critical for pheromone detection, while HarmGR9—abdomen-enriched—regulates satiety and sugar intake [17]. In Spodoptera littoralis (Boisduval, 1833), SlitIR87a is antenna-specific, implicating it in odor sensing [31]; in S. frugiperda, SfruCSP1/2 are highly expressed in larval cuticle and may serve as targets for insecticides like chlorfenapyr [37]. In H. assulta and H. armigera, mouthpart-biased CSP4 facilitates sugar feeding by reducing surface tension [36]. Moreover, in Holotrichia parallela, the pheromone receptor OR14 shows male-biased, circadian-rhythmic expression synchronized with its every-other-day pheromone release behavior [38]. Collectively, these findings reveal that chemosensory genes are deeply integrated into the behavioral regulatory networks governing the insect life cycle [35,39], and their sex- and tissue-biased expression provides key clues for dissecting behaviors like female oviposition preference or male pheromone tracking.
Despite extensive characterization in other major lepidopteran pests, the molecular landscape of the soybean pod borer’s chemosensory system remains unexplored. Current knowledge is limited to the identification of its sex pheromones—primarily (E)-10-dodecen-1-yl acetate and (E,E)-8,10-dodecadien-1-yl acetate—and associated electrophysiological responses (e.g., EAG and GC–EAD) [9]. However, the molecular receptors and binding proteins involved in detecting these pheromones and host volatiles have not been systematically identified, especially those in the GR and IR families.
To address this gap, we performed transcriptome sequencing of multiple adult tissues of L. glycinivorella, including male and female antennae, heads without antennae, thoraces, abdomens, and legs. We comprehensively annotated all six core chemosensory gene families (ORs, GRs, IRs, OBPs, CSPs, and SNMPs), analyzed their phylogenetic relationships with homologs from other Lepidoptera, and integrated tissue-specific expression profiles to prioritize candidate genes potentially involved in (1) sex pheromone perception (e.g., ORs, OBPs, and SNMPs), (2) host volatile detection (e.g., ORs and IRs), and (3) contact-mediated oviposition behavior (e.g., GRs, IRs, and CSPs). This study fills a critical gap in the molecular understanding of L. glycinivorella chemoreception and provides a foundational resource and promising molecular targets for future functional validation and the development of chemosensory-based green control strategies, such as behavioral disruptors, pheromone synergists, or RNA-targeted biopesticides.
2. Materials and Methods
2.1. Insect Source and Tissue Sampling
The L. glycinivorella colony used in this study was maintained in the insectary of the Economic Crop Pest Management and Control Team, Jilin Agricultural University. The colony was originally established from individuals collected at the Xidi experimental field in Changchun, Jilin Province, China (43.81061° N, 125.40368° E). Insects were reared for multiple generations under controlled laboratory conditions at 25 ± 1 °C, 65 ± 5% relative humidity, and a 16 h light:8 h dark photoperiod. Newly emerged adults were supplied with 10% (v/v) honey solution.
To obtain chemosensory-related transcripts from different adult tissues, eight tissue types were dissected from 3-day-old female and male adults, including female antennae (FA), male antennae (MA), female heads without antennae (FH), male heads without antennae (MH), mixed thoraces (T), mixed legs (L), female abdomens (FAb), and male abdomens (MAb). All dissections were performed on ice. Collected tissues were immediately frozen in liquid nitrogen and stored at −80 °C until RNA extraction.
2.2. RNA Extraction and Transcriptome Sequencing
Total RNA was extracted from each tissue using TRIzol reagent (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA). RNA purity and concentration were assessed using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Thermo Fisher Scientific, Waltham, MA, USA), and RNA integrity was verified with an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) or LabChip GX system (Revvity, Hopkinton, MA, USA). Qualified RNA samples (RIN ≥ 7.0) were sent to Biomarker Technologies (Beijing, China) for library construction and sequencing. Stranded cDNA libraries were prepared using oligo (dT) magnetic beads to enrich poly(A)+ mRNA. The mRNA was fragmented, reverse-transcribed into first- and second-strand cDNA, and then subjected to end repair, A-tailing, and adapter ligation. Libraries were size-selected using AMPure XP beads (Beckman Coulter, Brea, CA, USA), followed by PCR amplification. Library quality was assessed using a Qubit 3.0 fluorometer (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA), and libraries with concentrations ≥ 1 ng/μL were retained. Insert size distribution was validated on a Qsep400 system (BiOptic Inc., New Taipei City, Taiwan), and effective library concentration (>2 nM) was determined by qPCR. Qualified libraries were sequenced on an Illumina platform in paired-end 150 bp (PE150) mode.
The raw transcriptome sequencing data generated in this study have been deposited in the NCBI Sequence Read Archive (SRA) under BioProject accession number PRJNA1434332. The corresponding BioSample and SRA run accession numbers are provided in Supplementary Table S1.
2.3. Transcriptome Assembly and Functional Annotation
Raw reads were quality-filtered to remove adapter-contaminated and low-quality reads, yielding clean data. Clean reads were aligned to the L. glycinivorella reference genome (Leguminivora_glycinivorella.GCF_023078275.1.genome.fa) using HISAT2 v2.0.4 [40]. Transcriptome reconstruction and expression quantification were performed using StringTie v2.2.1 [41], and gene expression levels were normalized as FPKM. Functional annotation was conducted using the BMKCloud (Biomarker Technologies) standard pipeline, including DIAMOND searches against Nr/Swiss-Prot/KEGG and related databases [42], GO assignment using InterProScan [43], and conserved domain identification using Pfam [44].
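As a reminder of what the FPKM unit expresses, the following minimal Python sketch computes it from raw quantities. It is an illustration only: StringTie derives FPKM internally from its assembled transcripts, and the function name, argument names, and example numbers here are hypothetical rather than taken from the study.

```python
def fpkm(fragments: float, gene_length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (gene_length_bp * total_mapped_fragments)
    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings.
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on a 2,000 bp transcript in a library
# of 20 million mapped fragments.
example_fpkm = fpkm(fragments=500, gene_length_bp=2000, total_mapped_fragments=20_000_000)
print(example_fpkm)  # 12.5
```

Because FPKM scales by both transcript length and library size, values are comparable across genes within a sample, but comparisons across samples still depend on library composition.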
2.4. Identification and Structural Prediction of Chemosensory Genes
Six chemosensory gene families (OBP, CSP, OR, IR, GR, and SNMP) were identified from the annotated transcriptome. To improve annotation accuracy, homologous sequences from closely related lepidopterans were used as references, and candidate genes were confirmed by BLASTP searches against the NCBI database. Open reading frames (ORFs) were predicted using ORFfinder (NCBI, https://www.ncbi.nlm.nih.gov/orffinder/, accessed on 13 March 2026). Signal peptides were predicted using SignalP 5.0 (https://services.healthtech.dtu.dk/service.php?SignalP-5.0, accessed on 13 March 2026), and transmembrane domains were predicted using the TOPCONS web server (https://topcons.cbr.su.se/, accessed on 13 March 2026). Molecular weight and isoelectric point were calculated using the ExPASy Compute pI/Mw tool (https://web.expasy.org/compute_pi/, accessed on 13 March 2026). Amino acid sequence alignment and manual editing were performed using DNAMAN 8.0.
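To make the notion of a full-length ORF concrete, the sketch below scans the three forward reading frames of a nucleotide sequence for the longest stretch running from an ATG to the first in-frame stop codon. It is a simplified stand-in and not the algorithm of the NCBI web tool, which also handles the reverse strand, alternative start codons, and other genetic codes; the function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> str:
    """Return the longest ATG-to-stop ORF (stop codon included) found in
    any of the three forward reading frames; empty string if none exists."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the first ATG of the ORF currently open in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None  # close the ORF and look for the next ATG
    return best


# Toy transcript fragment (not a real L. glycinivorella sequence).
toy = "GGATGGCTTGCTAAGG"
print(longest_orf(toy))  # ATGGCTTGCTAA
```

A candidate would count as full-length in this simplified sense when the returned ORF begins with ATG and ends with a stop codon, which the function guarantees by construction.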
2.5. Phylogenetic Analysis
Amino acid sequences of L. glycinivorella chemosensory genes and homologous sequences from representative insect species were aligned using MAFFT v7.52 [45]. Poorly aligned and highly gapped regions were trimmed prior to phylogenetic reconstruction. Maximum-likelihood (ML) phylogenetic trees were constructed using IQ-TREE v1.6.12 [46]. The best-fit amino acid substitution model for each dataset was automatically selected using ModelFinder implemented in IQ-TREE [47], with JTT + F + R7 for ORs, VT + F + G4 for GRs, JTT + F + R8 for IRs, LG + R4 for OBPs, LG + F + R4 for CSPs, and LG + I + G4 for SNMPs. Node support was assessed using 10,000 ultrafast bootstrap replicates [48]. Because these chemosensory gene families are highly divergent and suitable outgroups were not available for reliable rooting, the ML phylogenetic trees were presented and interpreted as unrooted trees.
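The trimming step can be illustrated with a simple gap-fraction filter. The text does not name the trimming tool or threshold, so the Python sketch below is only one plausible approach: the function name, the 0.5 default cutoff, and the toy alignment are all assumptions made for illustration.

```python
def trim_gappy_columns(alignment: dict[str, str], max_gap_fraction: float = 0.5) -> dict[str, str]:
    """Drop alignment columns whose fraction of gap characters ('-')
    exceeds max_gap_fraction. Keys are sequence names, values are aligned rows."""
    if not alignment:
        return {}
    rows = list(alignment.values())
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        j for j in range(length)
        if sum(row[j] == "-" for row in rows) / len(rows) <= max_gap_fraction
    ]
    return {name: "".join(row[j] for j in keep) for name, row in alignment.items()}


# Toy alignment: the third column is a gap in two of three sequences and is removed.
toy_alignment = {"seqA": "MK-LC", "seqB": "MR-LC", "seqC": "M-ALC"}
trimmed = trim_gappy_columns(toy_alignment)
print(trimmed)  # {'seqA': 'MKLC', 'seqB': 'MRLC', 'seqC': 'M-LC'}
```

For families as divergent as ORs and GRs, the cutoff trades phylogenetic signal against alignment noise, so any threshold of this kind would need to be reported alongside the trees.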
To further evaluate the robustness of the phylogenetic inference, Bayesian inference (BI) analyses were additionally performed based on CDS datasets. CDS sequences were aligned using MACSE v2.06 [49], and the alignments were manually inspected and trimmed in MEGA 11 [50]. BI trees were reconstructed using MrBayes 3.2.7a [51] implemented in PhyloSuite v1.2.2 [52], and the resulting trees are provided in the Supplementary Materials. Final tree visualization and annotation were performed using iTOL v7.6 [53].
2.6. Differential Expression Analysis
To characterize tissue-specific expression patterns, FPKM values of candidate chemosensory genes were log₂(FPKM + 1) transformed and normalized. The processed data were uploaded to Biomarker’s online platform for heatmap visualization using default clustering parameters.
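The transformation can be sketched in a few lines of Python. The log transform follows the text directly; the row-wise z-score is an assumption, since the normalization applied by the platform is not specified, and the function names and FPKM values are hypothetical.

```python
import math
from statistics import mean, pstdev


def log2_fpkm(values: list[float]) -> list[float]:
    """Apply log2(FPKM + 1); the pseudocount of 1 keeps zero FPKM values defined."""
    return [math.log2(v + 1.0) for v in values]


def row_zscore(values: list[float]) -> list[float]:
    """Centre and scale one gene's values across tissues (assumed normalization)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # flat profile: no variation to scale
    return [(v - mu) / sd for v in values]


# Hypothetical FPKM values for one gene across four tissues.
transformed = log2_fpkm([0.0, 1.0, 3.0, 7.0])
print(transformed)  # [0.0, 1.0, 2.0, 3.0]
scaled = row_zscore(transformed)
```

Row-wise scaling emphasizes where each gene peaks rather than how abundant it is, which suits a tissue-bias heatmap but hides absolute expression differences between genes.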
Differential expression analysis was conducted on Biomarker’s online platform using the EBSeq package. Genes with |log₂(fold change)| > 1 and p < 0.05 were considered significantly differentially expressed. Special emphasis was placed on the comparison between FA and MA to identify female antenna-biased chemosensory candidates.
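The two thresholds stated above can be expressed as a small classifier for the FA versus MA contrast. This Python sketch only applies the cutoffs; it does not reproduce EBSeq, which estimates fold changes and significance with its own model. The pseudocount, the function name, and the example values are assumptions for illustration.

```python
import math


def classify_sex_bias(fpkm_female: float, fpkm_male: float, p_value: float,
                      lfc_cutoff: float = 1.0, p_cutoff: float = 0.05,
                      pseudocount: float = 1.0) -> str:
    """Label a gene using |log2(fold change)| > lfc_cutoff and p < p_cutoff.

    Fold change is male over female, so positive values mean male-biased.
    The pseudocount (an assumption) avoids division by zero.
    """
    lfc = math.log2((fpkm_male + pseudocount) / (fpkm_female + pseudocount))
    if p_value < p_cutoff and abs(lfc) > lfc_cutoff:
        return "male_biased" if lfc > 0 else "female_biased"
    return "not_significant"


# Hypothetical genes: (female antennae FPKM, male antennae FPKM, p-value).
print(classify_sex_bias(3.0, 15.0, 0.01))   # male_biased
print(classify_sex_bias(15.0, 3.0, 0.01))   # female_biased
print(classify_sex_bias(3.0, 15.0, 0.20))   # not_significant
```

Stating the direction of the ratio explicitly, as in the docstring, avoids ambiguity when results are later described as up- or downregulated in one sex.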
2.7. qRT-PCR Validation
To support transcriptome-derived expression profiling, qRT-PCR was performed for ten representative odorant receptor genes (LglyORco, LglyOR1a, LglyOR11b, LglyOR49b, LglyOR6c, LglyOR6d, LglyOR15a, LglyOR2, LglyOR6b, and LglyOR6a) using an independent batch of male antennae from L. glycinivorella. Total RNA from the independent batch of male antennae was treated with DNase I and reverse-transcribed into cDNA. qRT-PCR was performed using PerfectStart® Green qPCR SuperMix (TransGen Biotech, Beijing, China; AQ601-02) according to the manufacturer’s instructions, and amplification specificity was confirmed by melting-curve analysis. Relative expression levels were calculated using the 2^(−ΔΔCt) method with Actin as the internal reference gene. Data are presented as mean ± SEM from technical replicates (wells, n = 9). Primer sequences are provided in Table S2.
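The relative quantification can be written out as a short calculation. In the Python sketch below, the choice of calibrator is an assumption because the text does not state which gene or sample served as the baseline, and all Ct values and function names are hypothetical.

```python
import math
from statistics import mean, stdev


def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_cal: float, ct_reference_cal: float) -> float:
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene, e.g. Actin)
    ddCt = dCt(sample) - dCt(calibrator)
    """
    d_ct_sample = ct_target - ct_reference
    d_ct_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(d_ct_sample - d_ct_calibrator)


def sem(values: list[float]) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# (relative to the reference gene) than in the calibrator, i.e. 8-fold higher.
fold = relative_expression(ct_target=22.0, ct_reference=18.0,
                           ct_target_cal=25.0, ct_reference_cal=18.0)
print(fold)  # 8.0

replicate_folds = [7.6, 8.0, 8.4]  # hypothetical technical replicates
print(mean(replicate_folds), sem(replicate_folds))
```

Because the SEM here comes from technical replicate wells, it reflects pipetting and instrument precision rather than biological variation between insects.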
3. Results
3.1. Transcriptome Sequencing and Assembly Quality Assessment
High-throughput transcriptome sequencing was performed on eight tissue samples of L. glycinivorella, including female antennae (FA), male antennae (MA), female heads without antennae (FH), male heads without antennae (MH), pooled thoraces (T), pooled legs (L), female abdomens (FAb), and male abdomens (MAb). In total, 137.99 Gb of clean data were generated, with each sample yielding at least 6.04 Gb. The Q30 percentage exceeded 97.78% across all samples, indicating high sequencing quality.
Clean reads were mapped to the L. glycinivorella reference genome, with mapping rates ranging from 66.83% to 80.41%. Based on these alignments, alternative splicing events were predicted, gene models were refined, and novel genes were identified. In total, 5593 novel genes were discovered, of which 1402 received functional annotations (Table 1).
3.2. Identification of Chemosensory-Related Genes
Based on transcriptome annotation and homology searches, we identified candidate members of the six major chemosensory gene families, including odorant receptors (ORs), gustatory receptors (GRs), ionotropic receptors (IRs), odorant-binding proteins (OBPs), chemosensory proteins (CSPs), and sensory neuron membrane proteins (SNMPs). Among the soluble olfactory proteins, we identified 52 OBPs and 18 CSPs (Supplementary Tables S3 and S4). All candidate sequences were further validated by ORF prediction. Among them, 44 OBPs and 16 CSPs contained full-length open reading frames. Sequence analysis revealed that most canonical OBPs retain the conserved six-cysteine motif (C-pattern), while CSPs possess four conserved cysteine residues, both of which are important for tertiary structure stability and ligand binding. Both OBP and CSP members were predicted to contain an N-terminal signal peptide and to lack transmembrane domains, consistent with their roles as secreted carrier proteins that bind and transport hydrophobic volatile molecules through the aqueous sensillar lymph.
In addition to soluble proteins, we identified four families of membrane-bound chemosensory genes (Supplementary Tables S5–S8): 76 ORs, of which 54 were full-length; 15 GRs, all full-length; 18 IRs, with 7 full-length; and 4 SNMPs, all full-length. ORs and GRs typically exhibited 6–7 predicted transmembrane domains, consistent with seven-transmembrane chemosensory receptors. SNMPs featured two transmembrane domains and a large extracellular loop, which are hallmark features of CD36-family proteins implicated in olfactory signaling. IRs retained conserved domains and key residues homologous to ionotropic glutamate receptors (iGluRs), reflecting structural conservation. Overall, the structural features of L. glycinivorella chemosensory genes closely resemble those reported in other lepidopterans, supporting the reliability of our annotations.
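The family counts reported in this section can be tallied to check that they sum to the total number of candidates. The Python sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Candidate genes and full-length ORFs per family, as reported in Section 3.2.
identified = {"OBP": 52, "CSP": 18, "OR": 76, "GR": 15, "IR": 18, "SNMP": 4}
full_length = {"OBP": 44, "CSP": 16, "OR": 54, "GR": 15, "IR": 7, "SNMP": 4}

total_identified = sum(identified.values())
total_full_length = sum(full_length.values())

for family, n in identified.items():
    share = full_length[family] / n
    print(f"{family:5s} {n:3d} identified, {full_length[family]:3d} full-length ({share:.0%})")

print(f"Total: {total_identified} candidates, {total_full_length} full-length")
# Total: 183 candidates, 140 full-length
```

The total of 183 matches the number of candidate chemosensory genes stated in the abstract, and the IR family stands out as the one with the lowest proportion of full-length sequences.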
3.3. Phylogenetic Analysis of Chemosensory Genes
To infer evolutionary relationships, maximum-likelihood phylogenetic trees were constructed for all six chemosensory gene families (OR, GR, IR, OBP, CSP, and SNMP) using homologs from representative Lepidoptera. In general, L. glycinivorella candidates clustered within well-supported, conserved clades of lepidopteran orthologs, providing phylogenetic support for functional inference. Bayesian inference (BI) phylogenetic analyses based on CDS datasets yielded topologies generally consistent with the ML trees, further supporting the robustness of the inferred phylogenetic relationships (Supplementary Figures S1–S6).
OR Family (Figure 1): The co-receptor LglyOrco formed a distinct, strongly supported clade with Orco orthologs from other lepidopterans, consistent with its conserved role. Excluding Orco, 13 LglyORs grouped within the lepidopteran pheromone receptor (PR) clade, suggesting potential involvement in sex pheromone detection. The remaining LglyORs were distributed across multiple subfamilies: some showed putative one-to-one orthology with reported host-volatile receptors, whereas others formed species-specific clusters, indicating possible functional diversification in host-odor and environmental-cue recognition. The BI tree showed a generally consistent topology with the ML tree, further supporting the phylogenetic placement of LglyOrco and the candidate PR lineage (Supplementary Figure S1).
GR Family (Figure 2): The phylogenetic tree clarified the correspondence between L. glycinivorella GR members and major functional subfamilies of lepidopteran GRs. Within the putative CO₂ receptor clade (GR1/GR2/GR3 lineage), LglyGR2 clustered with orthologs from other lepidopterans, whereas no GR1- or GR3-like homologs were recovered from our transcriptome dataset. In the sugar receptor-related clade, LglyGR5 and LglyGR6 were identified. In addition, no L. glycinivorella candidates were assigned to the bitter taste receptor-related clade in the phylogenetic analysis. Bayesian inference analysis recovered a topology largely consistent with the ML tree, supporting the inferred placement of the CO₂ receptor and sugar receptor-related GR lineages (Supplementary Figure S2).
IR family (Figure 3): Among the 18 identified LglyIR genes, phylogenetic analysis classified them into three subfamilies. The Antennal IRs (A-IRs) include two conserved co-receptors, LglyIR8a and LglyIR25a, as well as a group of tuning receptors—LglyIR21a, LglyIR40a, LglyIR75q.1, and LglyIR87a. Additionally, a L. glycinivorella-specific clade was identified, which clusters with the co-receptors IR25a and IR8a and comprises four paralogs: LglyIR25b, LglyIR25c, LglyIR25d, and LglyIR25e. Within the Lepidoptera-Specific IRs (LS-IRs), three members were identified: LglyIR1.1, LglyIR1.2, and LglyIR2. In the Divergent IRs (D-IRs), only four copies from the IR7d family were detected (LglyIR7d.1.1, LglyIR7d.1.2, LglyIR7d.2.1, and LglyIR7d.4). No L. glycinivorella homologs were found in the IR100a, IR100b, IR143, or IR85a clades. The BI tree was generally congruent with the ML topology and supported the major IR clades identified in this study (Supplementary Figure S3).
OBP Family (Figure 4): Phylogenetic analysis showed that L. glycinivorella OBPs are distributed across the major lepidopteran OBP clades. Notably, several LglyOBPs formed sister pairs or small species-specific clades (e.g., LglyOBP12a/12b and LglyOBP18a/18b), consistent with recent lineage-specific duplications followed by sequence divergence. Bayesian inference analysis yielded a generally consistent topology with the ML tree, further supporting the observed OBP clustering pattern and the inferred lineage-specific duplications (Supplementary Figure S4).
CSP family (Figure 5): L. glycinivorella CSPs co-clustered with CSPs from representative lepidopteran species across multiple clades, indicating an overall conserved evolutionary pattern. A few members formed closely related pairs or species-specific subclades (e.g., LglyCSP7a and LglyCSP7b), suggesting possible local gene duplication or expansion. Such lineage-specific changes may contribute to the functional diversification of CSPs beyond olfaction across multiple tissues. The BI tree showed an overall topology consistent with the ML analysis, supporting both the conserved CSP clades and the inferred species-specific expansions (Supplementary Figure S5).
SNMP Family (Figure 6): SNMP members were clearly divided into two subfamilies, SNMP1 and SNMP2. LglySNMP1 clustered with SNMP1 homologs from other lepidopterans in a single well-supported clade, consistent with the reported association of SNMP1 with sex pheromone detection. In contrast, LglySNMP2a, LglySNMP2b, and LglySNMP3c grouped within the SNMP2 clade, reflecting divergence from SNMP1. Given that SNMP1 has been shown to play an essential role in sex pheromone perception in several lepidopteran species, LglySNMP1 represents a high-priority candidate for subsequent functional validation. Bayesian inference analysis also supported the separation of SNMP1 and SNMP2 and the phylogenetic placement of LglySNMP1 within the SNMP1 clade (Supplementary Figure S6).
In summary, phylogenetic analyses indicate that the chemosensory-related genes of L. glycinivorella exhibit typical lepidopteran evolutionary characteristics and show evidence of potential gene duplication and lineage-specific divergence, providing a foundation for subsequent screening of key functional candidates based on expression profiles, electrophysiological assays, or genetic manipulation. The overall congruence between the ML and BI trees further supports the robustness of these phylogenetic inferences.
3.4. Tissue-Specific Expression Analysis of Chemosensory Genes
To characterize the expression distribution of chemosensory-related genes across tissues and sex-biased expression in the antennae of L. glycinivorella, we generated clustered heatmaps based on FPKM values for the six gene families (OR, GR, IR, OBP, CSP, and SNMP) and performed differential expression analysis between female and male antennae; significantly differentially expressed genes were visualized in volcano plots.
Overall, the heatmap results showed that, compared with heads (antennae removed), thoraces, legs, and abdomens, genes from the OR, IR, OBP, and SNMP families were generally more highly expressed in antennae, exhibiting a clear antennal-enriched pattern. In contrast, CSP and GR family genes showed broader expression overall, although some members still exhibited relatively higher expression in antennae or displayed sex-biased expression.
At the family level, the OR family was strongly enriched in antennae, and several ORs showed relatively higher expression in female antennae (e.g., LglyOR13 and LglyOR15). The GR family exhibited a relatively dispersed expression pattern but still included sex-differentially expressed genes. The IR family contained only a small number of differentially expressed genes, and no down-regulated genes were detected. Among soluble carrier proteins, the OBP family was primarily antennal-enriched and included sex-differentially expressed genes (e.g., LglyOBP2 and LglyOBP8), whereas the CSP family was broadly expressed, with only a few genes showing sex-biased expression. Finally, the SNMP family showed overall high expression in antennae and a detectable degree of sex-biased expression.
Multiple OR genes in L. glycinivorella were significantly enriched in male antennae, whereas several genes, including LglyOR13, LglyOR15, LglyOR32a, and LglyOR35a, exhibited tissue-specific high expression in female antennae, heads, or abdomens (Figure 7A). The differential-expression volcano plot (Figure 7B) further indicated sex-biased expression between female and male antennae: 12 OR genes were significantly higher in male antennae (e.g., LglyOR11e, LglyOR2, LglyOR6c, and LglyOR4a), whereas five genes (e.g., LglyOR10, LglyOR32a, and LglyOR35a) showed significantly lower expression in female antennae; most remaining ORs did not exhibit significant changes.
To strengthen the robustness of transcriptome-based expression profiling, we performed qRT-PCR for ten representative OR genes using an independent batch of male antennae (Figure 7C). The qRT-PCR results revealed a clear expression hierarchy among these candidates in male antennae, providing independent experimental support for the transcriptome-derived OR expression profile. ORs were prioritized for validation because OR-mediated olfaction constitutes a core molecular basis of odor detection in insects and is directly relevant to antennal odorant recognition.
Most LglyGR genes showed higher expression in the antennae, particularly in male antennae, such as LglyGR5, LglyGR6, LglyGR2, and LglyGR43. In contrast, LglyGR68.1, LglyGR63, LglyGR60, LglyGR53, LglyGR8, LglyGR14, LglyGR10, LglyGR68.2, LglyGR68.3, and LglyGR9 exhibited higher expression in the thorax, legs, or female tissues (Figure 8A). Some of these genes were more highly expressed in female antennae or abdomens, indicated in blue, suggesting that they may participate in sex-specific or non-olfactory sensory functions. A volcano plot of differential GR expression between male and female antennae (Figure 8B) showed that LglyGR53 and LglyGR68.1 were significantly upregulated in male antennae, while LglyGR14 and LglyGR43 were significantly downregulated in female antennae. The remaining genes did not show significant differential expression.
Among the antennal IRs (A-IRs), the co-receptor LglyIR8a was highly expressed in both male and female antennae and showed detectable expression in heads, abdomens, and legs, whereas LglyIR25a exhibited extremely low expression (Figure 9A,B). Its paralogs LglyIR25b–e displayed divergent expression patterns: LglyIR25c was highly expressed in male antennae; LglyIR25b was enriched in thoraces and legs; LglyIR25d was primarily expressed in male abdomens; and LglyIR25e showed the highest expression in legs. Among the tuning receptors, LglyIR75q.1 showed the highest expression in male antennae, and LglyIR40a was also enriched in male antennae, whereas LglyIR21a and LglyIR87a were expressed at low levels in antennae. Within the lepidopteran-specific IRs (LS-IRs), LglyIR2 showed the highest expression in male antennae, followed by LglyIR1.2, while LglyIR1.1 was expressed at a low level in male antennae; all three showed very low or undetectable expression in other tissues. In the divergent IRs (D-IRs), LglyIR7d.2.1 displayed the highest expression in male antennae, and LglyIR7d.1.1, LglyIR7d.1.2, and LglyIR7d.4 also showed male antennae as their primary expression site, with minimal or no expression in other tissues. Consistent with the phylogenetic analysis, no IR100a-, IR100b-, IR143-, or IR85a-like homologs were recovered from our transcriptome dataset, and no corresponding expression signals were observed in the heatmap. Differential expression analysis between female and male antennae further indicated that LglyIR25b and LglyIR25d were significantly upregulated in male antennae, with LglyIR25b showing the strongest statistical support; no other IR genes showed significant sex-biased expression in antennae.
The heatmap showed that most LglyOBP genes were highly expressed in antennae, with pronounced enrichment in both male and female antennae (Figure 10A). Notably, several OBPs (e.g., LglyOBP33, LglyOBP4, and LglyOBP23) exhibited higher expression in male antennae. Volcano-plot analysis further indicated that multiple OBP genes were significantly differentially expressed between antennae and non-antennal tissues, with LglyOBP33, LglyOBP4, and LglyOBP23 significantly upregulated and LglyOBP19 and LglyOBP12a significantly downregulated (Figure 10B).
The heatmap showed that most LglyCSP genes were highly expressed in the antennae of L. glycinivorella, with clear sex-biased patterns. LglyCSP1, LglyCSP2, LglyCSP8, LglyCSP11, and LglyCSP14 exhibited higher expression in male antennae, whereas LglyCSP15 and LglyCSP17 showed higher expression in female antennae (Figure 11A). In addition, some genes (e.g., LglyCSP2 and LglyCSP3) displayed relatively high expression across multiple tissues. Volcano-plot analysis further indicated that LglyCSP1 and LglyCSP22 were significantly upregulated in male antennae, whereas LglyCSP17 was significantly upregulated in female antennae (Figure 11B).
LglySNMP1 exhibited significantly high expression in both male and female antennae, while its expression was markedly lower in other tissues such as the abdomen, head, and thorax. LglySNMP2a and LglySNMP2b also showed relatively high expression in antennae and head, but their expression was broader compared to LglySNMP1, extending to legs and thorax (Figure 12A). The expression pattern of LglySNMP3c resembled that of LglySNMP1, though it was nearly absent in the abdomen and thorax. Volcano plot analysis of specific comparison groups (e.g., antennae vs. other tissues) further revealed that only LglySNMP1 was significantly upregulated in this contrast, whereas the other three genes did not show significant differential expression (Figure 12B).
4. Discussion
A thorough dissection of the chemosensory system in L. glycinivorella provides molecular targets and a theoretical basis for behavior-based green pest management strategies within integrated pest management (IPM), such as pheromone trapping and the screening of host-volatile attractants or repellents. Chemosensory molecules—including ORs, GRs, IRs, and peripheral auxiliary proteins such as OBPs—represent key entry points for “reverse chemical ecology” approaches aimed at discovering semiochemicals and optimizing behavioral control tactics [73]. However, the molecular mechanisms underlying chemical perception in L. glycinivorella remain poorly understood. Here, we systematically identified chemosensory-related genes through multi-tissue transcriptome sequencing and bioinformatic analyses and characterized their sequence features, phylogenetic relationships, and tissue-specific expression patterns, thereby providing a resource for subsequent functional validation and ligand screening of candidate genes [74].
ORs are core receptors in insect olfaction and typically form heteromeric complexes with the conserved co-receptor ORco to mediate signal transduction. In this study, we identified LglyORco, providing a basis for subsequent heterologous expression and ligand screening. In the silkworm (Bombyx mori), loss of ORco significantly reduces olfactory sensitivity and impairs related behaviors, underscoring the functional conservation and essentiality of ORco [75]. In Lepidoptera, pheromone receptors (PRs) represent a specialized clade within the OR family and mediate mate recognition. For instance, BmOR1 in B. mori has been functionally validated to detect a sex pheromone component [76], and PR lineages have undergone species-specific expansion and functional diversification across moth species [77]. Here, we identified 13 candidate PRs in L. glycinivorella, indicating a potentially diverse receptor repertoire for sex pheromone detection; however, functional assignment will require validation using sex- and tissue-specific expression patterns and ligand-response profiles from heterologous expression assays. A limitation of this study is the lack of paired qRT-PCR validation for FA–MA DEGs and the absence of functional assays for candidate ORs. Therefore, the PR candidates and female-biased ORs proposed here should be regarded as hypothesis-driven priorities for subsequent experimental validation. In contrast to the PR candidates, several ORs in our dataset (e.g., LglyOR13 and LglyOR15) were upregulated in female antennae, more plausibly indicating roles in host location or oviposition-site selection. Notably, phylogenetic placement alone does not strictly predict ligand type: for example, CpomOR3 in codling moth (Cydia pomonella), initially annotated as PR-like, was shown to respond to the host volatile pear ester [78]; similarly, HassOR31 in Helicoverpa assulta (Guenée, 1852) is highly expressed in the ovipositor and is implicated in host-volatile detection [79].
Therefore, as working hypotheses, candidate ORs can be prioritized into two functional categories: (1) PR candidates likely associated with sex pheromone communication, and (2) female-biased ORs potentially involved in recognizing host- or oviposition-related chemical cues. This framework provides biologically plausible priorities for future functional validation.
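The two-category prioritization described above can be written down as a simple rule. The sketch below is illustrative only: the `CandidateOR` record, its field names, and the function `priority_category` are hypothetical, the PR-clade flags for LglyOR13 and LglyOR15 are assumed to be false for the sake of the example, and the output is a working-hypothesis label rather than a functional annotation.

```python
from dataclasses import dataclass


@dataclass
class CandidateOR:
    name: str
    in_pr_clade: bool        # clusters with lepidopteran pheromone receptors
    antennal_sex_bias: str   # "male", "female", or "none"


def priority_category(c: CandidateOR) -> str:
    """Assign a validation priority; phylogeny and expression are only hints."""
    if c.in_pr_clade:
        return "PR candidate (sex pheromone communication)"
    if c.antennal_sex_bias == "female":
        return "female-biased OR (host or oviposition cues)"
    return "unprioritized"


candidates = [
    # Placeholder entry, not a gene named in the study.
    CandidateOR("hypothetical_PR_candidate", True, "male"),
    # Female-biased ORs named in the text; in_pr_clade=False is an assumption.
    CandidateOR("LglyOR13", False, "female"),
    CandidateOR("LglyOR15", False, "female"),
]
categories = {c.name: priority_category(c) for c in candidates}
print(categories)
```

Because phylogenetic placement does not strictly predict ligand type (as the CpomOR3 example shows), any such label only orders candidates for heterologous expression and ligand screening.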
GRs primarily mediate insect perception of contact chemical stimuli, such as sugars and bitter compounds, as well as gaseous cues such as CO₂, providing a molecular basis for feeding selection, host recognition, and oviposition decisions [63]. In this study, we identified 15 candidate LglyGR genes from multi-tissue transcriptomes, including putative GR43a-like (“fructose receptor”) members, candidate CO₂ receptor-related genes, and sugar receptor-related members, suggesting that L. glycinivorella retains molecular components for detecting nutritional signals and CO₂-related cues [63]. Functional evidence from other insects supports these annotations. In Drosophila, Gr43a has been confirmed as a highly specific fructose receptor and internal nutrient sensor that modulates feeding in response to physiological fructose levels [80]. Similar functions have been reported in Lepidoptera: Helicoverpa armigera HaGr4 responds specifically to fructose [81], while HaGr9 (reported as identical to or closely related to HaGr4) also responds to D-maltose and D-galactose in addition to fructose [82]. Likewise, Ostrinia furnacalis OfurGr43 has been shown to detect both D-fructose and D-sorbitol [61]. In L. glycinivorella, LglyGR1, LglyGR8, LglyGR9, LglyGR10, and LglyGR43 were assigned to the GR43a-related clade, suggesting that this pest may exhibit sensitivity to fructose-associated cues.
Notably, we did not recover homologs assigned to the canonical bitter GR lineage from our dataset [83], which may reflect the specialized oligophagous lifestyle of L. glycinivorella. Regarding CO₂ perception, lepidopterans typically retain a conserved CO₂ receptor lineage composed of three subunits (GR1, GR2, and GR3), which are predominantly expressed in labial palp sensilla and mediate behavioral responses to CO₂ [63]. For example, in H. armigera, co-expression of HarmGr1 with HarmGr3 (or HarmGr2) in heterologous systems elicits robust electrophysiological responses to bicarbonate/CO₂, and all three receptors are co-expressed in the same labial palp neuron, supporting a “GR complex-mediated CO₂ detection” model [84]. Similarly, in the fall webworm Hyphantria cunea, a binary system expressing HcunGR1 and HcunGR3 produces concentration-dependent responses to high CO₂, while a ternary combination (HcunGR1/2/3) is activated by CO₂-containing solutions [85]. In L. glycinivorella, however, we identified only LglyGR2 and did not recover GR1- or GR3-like homologs. This pattern may reflect one or more of the following: (1) labial palps were not separately sequenced, resulting in failure to recover low-abundance GR1/GR3 transcripts; (2) CO₂ detection may rely on GR2 alone or GR2 in combination with non-canonical subunits; or (3) CO₂-sensing capability may have undergone functional reduction. Accordingly, LglyGR2 should be interpreted cautiously as a potential component of a CO₂-sensing module. Future work should prioritize verifying its expression enrichment in key organs such as labial palps and conducting functional assays using CO₂ or bicarbonate as primary candidate stimuli [84].
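The completeness check on the CO₂ receptor module amounts to a set difference between the canonical subunits and those recovered. A minimal Python sketch follows; the function name `missing_subunits` is hypothetical, and absence from a transcriptome is not evidence of absence from the genome.

```python
from typing import Set

# Canonical lepidopteran CO2 receptor subunits, as described in the text.
CANONICAL_CO2_SUBUNITS: Set[str] = {"GR1", "GR2", "GR3"}


def missing_subunits(recovered: Set[str]) -> Set[str]:
    """Return canonical CO2 receptor subunits not found among recovered homologs."""
    return CANONICAL_CO2_SUBUNITS - recovered


# Only a GR2 homolog (LglyGR2) was recovered from this dataset.
recovered_in_dataset = {"GR2"}
missing = missing_subunits(recovered_in_dataset)
print(sorted(missing))  # ['GR1', 'GR3']
```

Re-running the same check on a labial palp transcriptome or a genome assembly would help distinguish a sampling gap from a genuine loss of GR1 and GR3.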
In this study, phylogenetic analysis of the 18 identified LglyIR genes in L. glycinivorella assigned them to the three IR subfamilies defined by Yin et al. (2021): antennal IRs (A-IRs), lepidopteran-specific IRs (LS-IRs), and divergent IRs (D-IRs) [18]. Within the A-IRs, LglyIR8a and LglyIR25a clustered with conserved co-receptors, whereas LglyIR21a, LglyIR40a, LglyIR75q.1, and LglyIR87a fell into their respective tuning receptor clades [25]. Notably, we recovered an L. glycinivorella-specific clade comprising LglyIR25b–e that grouped with IR25a, suggesting lineage-specific duplication of IR25 in this species. Expression analysis further indicated divergence within this clade: LglyIR25c was highly expressed in male antennae; LglyIR25b was enriched in thoraces and legs; LglyIR25d was primarily expressed in male abdomens; and LglyIR25e showed the highest expression in legs. Differential expression analysis further showed that LglyIR25b and LglyIR25d were significantly upregulated in male antennae, suggesting potential roles in male-biased chemosensory processes. Several tuning receptors exhibited high expression in male antennae, including LglyIR75q.1, LglyIR2, and LglyIR40a, suggesting potential involvement in male-biased behaviors such as mate seeking and host-related odor detection. Within the LS-IRs, in addition to LglyIR2, both LglyIR1.1 and LglyIR1.2 were predominantly expressed in male antennae and showed minimal or no expression in other tissues, consistent with lepidopteran-specific olfactory tuning receptors. In the D-IRs, only four members of the IR7d family (LglyIR7d.1.1, LglyIR7d.1.2, LglyIR7d.2.1, and LglyIR7d.4) were recovered, and all were primarily expressed in male antennae. Typical D-IR lineages such as IR100a, IR100b, IR143, and IR85a were not recovered from our dataset, and no corresponding expression signals were observed in the heatmap.
Given that D-IRs in insects such as Drosophila are mainly implicated in gustation and perception of non-volatile compounds [28,29,30], this pattern may be associated with extreme host specialization in L. glycinivorella. Functionally, IRs often operate as heteromeric complexes composed of a co-receptor and a tuning receptor to detect acidic volatiles. For example, the Drosophila IR64a/IR8a complex responds to acidic odors [86]; in mosquitoes, IR8a-mediated pathways contribute to the detection of human-derived acidic compounds such as lactic acid [87]; and in H. armigera, IR8a is required for acetic acid attraction [88]. In summary, although the IR repertoire of L. glycinivorella is smaller than that of many other lepidopterans [18], its members exhibit pronounced tissue specificity and sex-biased expression, particularly the enrichment of multiple tuning receptors in male antennae, while retaining candidates potentially involved in acid and microenvironmental sensing. These features may reflect adaptation to the chemosensory ecology of its soybean host.
We identified 52 LglyOBP genes, including canonical forms and atypical subtypes (Minus-C, Dimer-C). The reduced cysteine motifs in Minus-C OBPs likely alter ligand-binding pocket flexibility, facilitating functional diversification [89,90]. Expression profiling revealed that LglyOBP2 and LglyOBP8 are antenna-enriched, with LglyOBP8 showing significant female bias. This pattern suggests a specific role in detecting host volatiles or oviposition cues, consistent with female-biased OBPs in other lepidopterans [91]. However, as expression patterns are correlative, future ligand-binding assays and behavioral tests are required to validate their specific ligand spectra and rule out functional redundancy [92].
We observed broad tissue distribution for LglyCSP members, supporting their dual roles in both chemosensation and non-olfactory physiological processes (e.g., development, immunity, and xenobiotic response) [93]. While some CSPs in other species mediate olfactory signaling or dietary adaptation [94,95], the ubiquitous expression of LglyCSPs implies a broader functional divergence beyond odorant transport. Additionally, although CSPs have been linked to insecticide resistance in other insects [96], their specific contributions to L. glycinivorella’s environmental adaptation and stress response remain to be elucidated through direct functional validation.
SNMPs belong to the CD36-related family of transmembrane proteins and are typically classified into two subfamilies in the insect peripheral olfactory system: SNMP1 and SNMP2 [35]. In this study, we identified one LglySNMP1 and three LglySNMP2 genes in L. glycinivorella. Phylogenetic analysis showed that LglySNMP1 clustered with lepidopteran SNMP1 homologs that have been implicated in sex pheromone detection. Consistently, LglySNMP1 showed higher expression in male antennae, in line with the reported role of SNMP1 in pheromone-sensitive olfactory neurons and pheromone signal transduction [35,72]. For example, in Helicoverpa armigera, CRISPR-mediated knockout of SNMP1 markedly reduced male electrophysiological responses to long-chain sex pheromone components and impaired upwind flight orientation and behavioral responses to calling females, supporting an essential role of SNMP1 in sex pheromone detection [72]. Therefore, LglySNMP1 represents a high-priority candidate for functional validation of sex pheromone signaling in L. glycinivorella.
In contrast, the three LglySNMP2 genes identified here exhibited broader expression across multiple tissues, suggesting that their functions may extend beyond direct involvement in sex pheromone recognition [72]. In various moth species, SNMP2 is mainly localized in support cells of olfactory sensilla and has been proposed to participate in transport or clearance of lipophilic compounds, including pheromone degradation products, from the sensillar lymph, thereby contributing to microenvironmental homeostasis and signal termination. Experimental evidence from Heliothis virescens and Bombyx mori supports a role of SNMP2 in uptake of long-chain fatty acids (pheromone catabolites) and sensillar “lymph clearance” processes [97,98]. Accordingly, LglySNMP2 members are more likely to function in supportive roles within the peripheral olfactory system, such as maintaining sensillar microenvironment stability and mediating clearance mechanisms. Future studies could combine cellular localization (e.g., in situ hybridization), RNAi or CRISPR-based knockdown/knockout, and integrated electrophysiological and behavioral assays to further dissect the distinct roles of LglySNMP1 and LglySNMP2 [72,97].
5. Conclusions
In summary, this study systematically identified the chemosensory gene repertoire of L. glycinivorella based on multi-tissue transcriptome sequencing, including 76 ORs, 15 GRs, 18 IRs, 52 OBPs, 18 CSPs, and 4 SNMPs. We comprehensively characterized their sequence features, phylogenetic relationships, and tissue-specific expression profiles. The results indicate that, while retaining core olfactory and gustatory pathways, this monophagous pest exhibits marked functional specialization and contraction of several chemosensory gene families relative to polyphagous lepidopterans, and multiple canonical lineages, such as bitter GRs and several divergent IR lineages (e.g., IR100/IR85a), were not recovered from our dataset. This streamlining may reflect adaptation to the chemical ecology of its sole host plant, soybean. Concurrently, key receptors and auxiliary proteins displayed pronounced differentiation across sexes and tissues: male antennae were enriched for numerous ORs, IRs, and LglySNMP1, suggesting important roles in sex pheromone communication and host-volatile perception; in contrast, female-biased or abdomen-enriched expression of genes such as LglyOBP2, LglyOBP8, LglyOR13 and LglyOR15 points to potential involvement in host location and oviposition decisions. This study further highlights functionally informative candidates: male-biased LglySNMP1 and candidate PRs represent priorities for validation in pheromone- or host volatile-based behavioral manipulation strategies, whereas female- or abdomen-enriched genes offer entry points for disrupting oviposition behavior. In addition, the presence of a GR43a-related clade supports the capacity for detecting nutritional cues such as fructose, whereas the incomplete CO₂ receptor module in our dataset (only LglyGR2 recovered) suggests potential divergence in CO₂-related sensing. Overall, delineating the composition and expression landscape of chemosensory genes in L. glycinivorella provides genetic resources and a conceptual basis for green pest control strategies grounded in “reverse chemical ecology.” Future efforts should prioritize ligand deorphanization, neural localization, and behavioral validation to translate these molecular candidates into field applications.
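As a quick consistency check, the six family counts given in the conclusions can be tallied against the 183 candidate genes reported in the abstract. The short Python snippet below does only that arithmetic; the dictionary name `family_counts` is arbitrary.

```python
# Per-family counts of candidate chemosensory genes, as listed in the conclusions.
family_counts = {"OR": 76, "GR": 15, "IR": 18, "OBP": 52, "CSP": 18, "SNMP": 4}

total = sum(family_counts.values())
print(total)  # 183
```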
References
- 1. Yang M., Wang Y., Dai P., Feng D., Hughes A.C., Li H., Zhang A. Sympatric diversity pattern driven by the secondary contact of two deeply divergent lineages of the soybean pod borer Leguminivora glycinivorella. Integr. Zool. 2025, 20, 1149–1168. doi:10.1111/1749-4877.12917. PMID: 39460509.
- 2. Cui J., Qiao F., Qin B., Xu J., Zhao L., Shi S. Oviposition selectivity and larval fitness of soybean pod borer, Leguminivora glycinivorella (Lepidoptera: Olethreutidae), to different soybean varieties. Chin. J. Oil Crop Sci. 2023, 45, 592–599. doi:10.19802/j.issn.1007-9084.2022088.
- 3. Hu D., Yang X., Feng J., Zhang X. Advances in the research and application of sex pheromone of soybean pod borer, Leguminivora glycinivorella (Matsumura). Chin. J. Pestic. Sci. 2014, 16, 235–244. doi:10.3969/j.issn.1008-7303.2014.03.01.
- 4. Yang M., Wang Z., Wang R., Zhang X., Li M., Xin J., Qin Y., Zhang C., Meng F. Transcriptomic and proteomic analyses of the mechanisms of overwintering diapause in soybean pod borer (Leguminivora glycinivorella). Pest Manag. Sci. 2020, 76, 4248–4257. doi:10.1002/ps.5989. PMID: 32633047.
- 5. Fang Q., Cao Y., Oo T.H., Zhang C., Yang M., Tang Y., Wang M., Zhang W., Zhang L., Zheng Y. Overexpression of cry 1c enhances resistance against soybean pod borer (Leguminivora glycinivorella) in soybean. Plants 2024, 13, 630. doi:10.3390/plants13050630. PMID: 38475476. PMCID: PMC10933787.
- 6. Fei H., Cui J., Zhu S., Xia Y., Xing Y., Gao Y., Shi S. Integrative analyses of transcriptomics and metabolomics in immune response of Leguminivora glycinivorella Mats to Beauveria bassiana infection. Insects 2024, 15, 126. doi:10.3390/insects15020126. PMID: 38392545. PMCID: PMC10889468.
- 7. Xue J.-Z., Tariq T., Shen Z., Zhang Y.-H., Tang L.-D., Luo R.-B., Sun Y., Hu C.-C., Zang L.-S. Eri silkworm eggs as a superior factitious host for mass rearing Trichogramma leucaniae, the key natural enemy of soybean pod borer. Biol. Control 2025, 209, 105860. doi:10.1016/j.biocontrol.2025.105860.
- 8. Yoshimura H., Tabuchi K., Konishi K. Ecological aspect of the larval parasitoid Diadegma hiraii (Hymenoptera: Ichneumonidae) as a potential biological control agent of soybean pod borer Leguminivora glycinivorella (Lepidoptera: Tortricidae). Environ. Entomol. 2023, 52, 309–316. doi:10.1093/ee/nvad038. PMID: 37058396.
