Transcriptional and functional characterization of terpene synthase genes of the aromatic plant Plectranthus hadiensis
Hinano Mizuno, Kenro Tokuhiro, Koichiro Iwai, Yoshikazu Furuta, Akari Nakasone, Hidenori Tanaka, Hiroki Sugimoto

TL;DR
This study identifies and characterizes 26 terpene synthase genes in Plectranthus hadiensis, revealing their roles in terpene production and highlighting a key gene for limonene synthesis.
Contribution
The first functional characterization of a limonene synthase gene in Plectranthus hadiensis.
Findings
26 terpene synthase genes were identified and classified into five subfamilies.
PhTPS1 was confirmed as a functional limonene synthase through yeast expression and GC-MS analysis.
PhTPS1 shows high expression in leaf and stem tissues, suggesting tissue-specific regulation.
Abstract
Plectranthus hadiensis (Lamiaceae) is recognized for its rich terpene content and potential applications in agriculture, medicine, and aromatherapy. Terpenes are major constituents of P. hadiensis essential oil, yet its terpene synthase (TPS) genes remain insufficiently characterized. In this study, we assembled a de novo transcriptome from RNA-seq data generated from leaf, stem, and root tissues and identified 26 TPS genes. Phylogenetic analysis classifies these genes into five TPS subfamilies (TPS-a, TPS-b, TPS-c, TPS-e/f, and TPS-g), broadly associated with sesquiterpene, monoterpene, and diterpene biosynthesis. Expression profiling revealed apparent tissue specificity; notably, PhTPS1 showed high transcript abundance in the leaf and stem. BLASTP analysis indicated that PhTPS1 is closely related to Lamiaceae monoterpene synthases, with the top hit being a rosemary (Salvia rosmarinus)…
Toyota Central R&D Labs., Inc.
Taxonomy
Topics: Plant biochemistry and biosynthesis · Plant Gene Expression Analysis · Plant-Derived Bioactive Compounds
Introduction
Plants synthesize diverse terpenes that contribute to various ecological functions—such as attracting pollinating insects, repelling predators, and communicating with other plants [1]. Terpenes are hydrocarbons composed of isoprene (C5) units [2] and include monoterpenes (C10), sesquiterpenes (C15), and diterpenes (C20), many of which are major constituents of essential oils [1,3,4]. Representative monoterpenes such as limonene, pinene, and camphene contribute to aroma and bioactivity [5], whereas sesquiterpenes and diterpenes include diverse metabolites involved in plant defense and development [6–8].
Monoterpenes, sesquiterpenes, and diterpenes are generally synthesized from geranyl diphosphate (GPP), farnesyl diphosphate (FPP), and geranylgeranyl diphosphate (GGPP), respectively (Fig 1) [9]. Terpene synthases (TPS) catalyze the conversion of these prenyl diphosphates into structurally diverse terpene scaffolds. Plant TPS typically contain two conserved Pfam domains (PF01397 and PF03936) [10,11] and are classified into multiple subfamilies (TPS-a, TPS-b, TPS-c, TPS-d, TPS-e/f, TPS-g, and TPS-h) that broadly correspond to substrate and product preferences [12,13]. For example, TPS-a is enriched in sesquiterpene synthases, TPS-b contains many monoterpene synthases, and TPS-c and TPS-e/f include diterpene cyclases and synthases, respectively [13]. However, these relationships are not absolute, as TPS enzymes can show substantial functional diversification and multiproduct catalysis [13–17].
Fig 1. Overview of the terpene biosynthesis pathway (mevalonate pathway). IPP: isopentenyl pyrophosphate (C5), DMAPP: dimethylallyl pyrophosphate (C5), HMG-R: HMG-CoA reductase, FPS: FPP synthase, GGPS: GGPP synthase. HMG1, ERG20, and BTS1 represent the yeast (Saccharomyces cerevisiae) genes encoding the respective enzymes.
Members of Lamiaceae (>250 genera and >7,000 species) synthesize a wide spectrum of terpenes via numerous TPS [18–23]. Research on Lamiaceae TPS has largely used targeted gene- and pathway-level approaches. For example, in Thymus caespititius, cloning and expression analyses identified two monoterpene synthases, a γ-terpinene synthase and an α-terpineol synthase, and defined their exon–intron structures [19,23]. In Mentha longifolia, a genome-wide survey revealed 63 TPS genes across six subfamilies, and functional assays verified that at least one TPS catalyzes the conversion of GPP to limonene [20]. In Origanum vulgare, seven TPS were isolated; the enzyme expression levels correlated with essential-oil composition, and thymol biosynthesis was found to proceed via γ-terpinene produced by a single monoterpene synthase [22]. Nevertheless, most studies still focus on subsets of enzymes or lineages rather than delivering species-level, tissue-resolved TPS inventories with functional validation [19–23].
Plectranthus, a major Lamiaceae genus with >300 species, is a promising source of essential oils and terpenes [24]. Despite extensive ethnobotanical use and demonstrated bioactivity [25], the high intra-generic diversity of Plectranthus complicates taxonomic resolution and standardization for product development [26]. This is especially important for medicinal applications, where complex plant extracts require careful evaluation of potential side effects [27]. Moreover, functional TPS studies in Plectranthus are limited. In P. amboinicus, enzymes related to linalool and nerolidol synthesis have been functionally identified, showing only ~60%–70% amino acid identity to TPS reported from other Lamiaceae genera [21]. In a recent study on P. barbatus, genome assembly enabled comparative analysis of diterpene-related TPS candidates across Lamiaceae members; however, the study did not focus on comprehensive functional characterization of Plectranthus TPS [28].
Collectively, these gaps underscore the need for a systematic, tissue-resolved catalog of the TPS repertoire across the genus Plectranthus—extending beyond the two species studied to date—together with functional validation. Plectranthus hadiensis essential oils reportedly exhibit repellent, antibacterial, antioxidant, pro-apoptotic, membrane-stabilizing/anti-platelet, and anti-inflammatory activities [29–33]. Despite its potential utility, P. hadiensis remains relatively understudied owing to difficulties in distinguishing it from other congeneric species [26] and is often substituted by P. barbatus or P. amboinicus [32,34]. Although some genetic studies on P. hadiensis have recently been conducted [26,35], comprehensive genetic information, including TPS gene family data, on this species remains limited.
To address the lack of a tissue-resolved TPS catalog and functional evidence in P. hadiensis, we set two objectives: (i) to assemble a de novo, tissue-resolved transcriptome and delineate the TPS complement by subfamily assignment and expression profiling across the leaf, stem, and root; and (ii) to test whether the most highly expressed TPS candidate encodes a limonene synthase using heterologous yeast expression and headspace gas chromatography-mass spectrometry (GC–MS). An overview of the experimental workflow is shown in S1 Fig. This design enabled, to our knowledge, the first functional validation of a limonene synthase (PhTPS1) from P. hadiensis and provides a curated TPS resource for this species.
Materials and methods
Sampling and RNA extraction
Plectranthus hadiensis was cultivated for 3 months using a hydroponic system that incorporated hydro balls and was maintained at room temperature (18–27 °C). The plants were grown under long-day conditions (16 h light/8 h dark, approximately 10–15 μmol m⁻² s⁻¹), mirroring indoor cultivation conditions, to achieve uniform growth and obtain high-quality RNA. The characteristic aroma was retained. The leaf (leaf blades more than 1 cm in length), stem, and root were sampled (S2 Fig). Total RNA was extracted from each sample using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions.
Sequencing and de novo assembly
Complementary DNA (cDNA) library construction and sequencing were outsourced to Nippon Genetics (Tokyo, Japan), and 151 bp paired-end reads were obtained using the NovaSeq6000 system (Illumina, San Diego, CA, USA). Low-quality reads were filtered and adapter sequences were trimmed using fastp v0.23.2 [36] and PRINSEQ v0.20.4 [37]. Quality assessment using fastp indicated that, on average, 93% of bases across all samples had Phred quality scores ≥ Q30. Next, randomly selected paired-end reads (1 M) from individual leaf, stem, and root samples were combined and subjected to de novo assembly using Trinity v2.15.0 [38,39] to generate the reference cDNA sequences of P. hadiensis. Gene prediction was conducted using TransDecoder v5.7.0 [40]. Predicted coding sequences were subjected to functional annotation using InterProScan v5.60-92.0 [41]. Additionally, transcripts per million (TPM) values were calculated using Salmon [42] within the Trinity pipeline, which employs a quasi-mapping approach and normalizes for both transcript length and sequencing depth. We calculated standard assembly statistics (number of contigs, total bases, mean and median contig length, and N50) and assessed completeness with BUSCO v5 [43,44].
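As an illustration of two quantities named above, the following minimal Python sketch computes N50 from a list of contig lengths and TPM from read counts and transcript lengths. It is not the Trinity or Salmon implementation (Salmon, for instance, works with estimated counts and effective lengths); the function names and toy numbers are hypothetical.

```python
def n50(contig_lengths):
    """Return N50: the contig length at which the running total of lengths,
    taken from longest to shortest, first reaches half the assembly size."""
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


def tpm(counts, lengths):
    """Convert read counts to transcripts per million (TPM).

    Each count is first divided by transcript length (length normalization),
    then the rates are rescaled so that they sum to one million
    (depth normalization)."""
    rates = [count / length for count, length in zip(counts, lengths)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


# Toy data (hypothetical, not from the study).
toy_lengths = [2000, 1500, 1000, 500]
toy_counts = [400, 150, 100, 50]
print(n50(toy_lengths))
print(tpm(toy_counts, toy_lengths))
```

Because TPM values in every sample sum to one million, they can be compared across samples of different sequencing depth, which is the property the heatmaps in this study rely on.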
To verify the species identity of the plant material, we queried the assembled proteome with the P. hadiensis RbcL protein (YP_010593282.1) using BLASTP v2.9.0+ [45,46], retrieved the best RbcL-like hit, and confirmed identity via pairwise alignment. This barcode check was used solely for sample authentication and not for TPS gene identification.
Quantification of gene expression and identification of differentially expressed genes
Reads from a total of eight samples (three leaf samples, two stem samples, and three root samples) were mapped to the reference sequences using HISAT2 v2.2.1 [47]. Gene expression levels for each sample were quantified using StringTie v1.3.4 [48], and TCC-GUI v2021.11.13 [49] was employed to identify differentially expressed genes (DEGs). Genes with an absolute M value (the difference in log2-based expression between tissues) of 2.5 or higher and an A value (the average of the log2-based expression between tissues) higher than 0 were selected as DEGs. Gene Ontology (GO) term enrichment analysis was performed using topGO v2.50.0 [50].
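The M/A selection rule can be sketched in a few lines of Python. This is a minimal illustration of the thresholds stated above, not the TCC-GUI procedure, which applies its own normalization before computing these values. The function names and example expression values are hypothetical, and inputs are assumed to be strictly positive mean expression values for two tissues.

```python
import math


def ma_values(expr_a, expr_b):
    """Return (M, A) for two positive expression values.

    M is the difference of the log2 expressions; A is their average."""
    log_a = math.log2(expr_a)
    log_b = math.log2(expr_b)
    return log_a - log_b, (log_a + log_b) / 2


def is_deg(expr_a, expr_b, m_cutoff=2.5, a_cutoff=0.0):
    """Apply the rule |M| >= 2.5 and A > 0 described in the text."""
    m, a = ma_values(expr_a, expr_b)
    return abs(m) >= m_cutoff and a > a_cutoff


# Hypothetical values: a 16-fold difference at moderate abundance passes,
# a 2-fold difference does not, and a large fold change at very low
# abundance is rejected by the A filter.
print(is_deg(64.0, 4.0))
print(is_deg(8.0, 4.0))
print(is_deg(0.5, 0.01))
```

The A filter removes genes whose fold change is large only because both values are close to zero, where ratios are dominated by noise.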
Identification and classification of TPS genes
Genes encoding amino acid sequences that contain both the N-terminal domain (InterPro; IPR001906, Pfam; PF01397) and metal-binding domain (InterPro; IPR005630, Pfam; PF03936) were extracted as TPS genes. A heatmap comparing expression levels among tissues was generated through R v4.2.2 [51] using the pheatmap v1.0.12 [52] package.
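The two-domain criterion amounts to a set-containment test per protein. The sketch below assumes the domain annotations have already been reduced to (protein ID, Pfam accession) pairs; that input layout, the function name, and the toy identifiers are hypothetical and do not reproduce the authors' scripts.

```python
from collections import defaultdict

# Pfam accessions named in the text: N-terminal and metal-binding domains.
REQUIRED_PFAM = {"PF01397", "PF03936"}


def find_tps_candidates(annotations):
    """Return sorted protein IDs that carry both required Pfam domains.

    `annotations` is an iterable of (protein_id, pfam_accession) pairs."""
    domains = defaultdict(set)
    for protein_id, pfam_id in annotations:
        domains[protein_id].add(pfam_id)
    return sorted(
        protein_id
        for protein_id, found in domains.items()
        if REQUIRED_PFAM <= found  # subset test: both domains present
    )


# Toy annotations (hypothetical identifiers).
toy = [
    ("geneA", "PF01397"),
    ("geneA", "PF03936"),
    ("geneB", "PF03936"),  # metal-binding domain only: excluded
    ("geneC", "PF00067"),  # unrelated domain: excluded
]
print(find_tps_candidates(toy))
```

Requiring both domains excludes fragmentary transcripts that retain only one half of the enzyme, at the cost of missing genuinely partial TPS sequences.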
Phylogenetic analysis was conducted using amino acid sequences of TPS genes from P. hadiensis together with reference TPS sequences from Arabidopsis thaliana, representative Lamiaceae species, and additional plant lineages, as needed to cover all major TPS subfamilies defined in previous studies [12,53] (listed in S1 Table). Sequences were aligned with MAFFT v7.453 [54], trimmed using trimAl v1.5.rev1 [55], and a maximum-likelihood tree was inferred with IQ-TREE v3.0.1 [56] with the best-fit substitution model selected by ModelFinder [57]. Branch support was evaluated using the ultrafast bootstrap approximation (UFBoot2) [58] with 1,000 replicates. TPS subfamily assignments for P. hadiensis sequences were determined based on their nearest phylogenetic placement relative to reference TPS sequences in the resulting tree. The final tree was visualized and annotated using Interactive Tree of Life (iTOL) v7 [59] with UFBoot support values.
Molecular cloning of the P. hadiensis TPS gene (PhTPS1)
One TPS gene that exhibited significantly high expression in the leaf and stem was selected for heterologous expression in yeast to assess its activity. This gene is henceforth referred to as PhTPS1. Considering the possibility of allelic variants, RNA sequencing (RNA-Seq) reads were remapped to PhTPS1 and visualized using IGV v2.3.8 [60] and Jalview v2 [61], thereby allowing for manual inspection of the aligned reads.
A yeast codon-optimized version of the PhTPS1 sequence, determined after remapping, was artificially synthesized. Similarly, a TPS from lemon (Citrus limon; GenBank: AAM53944.1) involved in limonene synthesis, which is active when expressed in yeast [62], was synthesized as a positive control (hereinafter referred to as ClTPS). Gene synthesis was performed through the GeneArt Gene Synthesis service of Thermo Fisher Scientific (Waltham, MA, USA).
The synthesized TPS sequences were polymerase chain reaction (PCR)-amplified (PhTPS1: S2 Table primers 1 and 2, ClTPS: S2 Table primers 3 and 4) and introduced into a pRS436GAP plasmid [63] (S3 Table No.5) linearized with SalI (Takara Bio, Kusatsu, Japan), using the In-Fusion Dry-Down PCR Cloning Kit w/ Cloning Enhancer (Takara Bio). The plasmids containing PhTPS1 and ClTPS were designated pRS436GAP-PhTPS1 (S3 Table No.6) and pRS436GAP-ClTPS (S3 Table No.7), respectively.
Sequence confirmation of expression plasmids was performed as follows. For pRS436GAP-PhTPS1 and pRS436GAP-ClTPS, the complete coding regions and vector–insert junctions were verified using Sanger sequencing. Sequencing was performed by Eurofins Genomics K.K. (Tokyo, Japan). The verified plasmids were used in all subsequent yeast experiments.
Construction of plant TPS-expressing yeast strains
We used pRS504HMG1/YPH499 [64] (S3 Table No.9) as the host strain. This strain overexpresses HMG1 encoding the rate-limiting enzyme hydroxymethylglutaryl-CoA reductase in the mevalonate pathway. It was transformed with various plasmids to enhance the production of different terpene precursors—GPP, FPP, and GGPP—consequently generating substrate-producing yeast lines. Specifically, the FPP-producing strain was generated by introducing pRS435GAP-ERG20 [65] (S3 Table No.3), which expresses ERG20 (farnesyl diphosphate synthase)—an enzyme possessing both FPP synthase and GPP synthase activities—under the control of the TDH3 promoter. It has been reported that an A99P mutation in ERG20 reduces FPP synthase activity while retaining GPP synthase activity, leading to GPP accumulation in yeast [64]. Accordingly, the GPP-producing strain was constructed by introducing pRS435GAP-ERG20(A99P) (S3 Table No.4). The GGPP-producing strain was generated by introducing pRS435GAP-GGF [65] (S3 Table No.2) encoding a fusion of Bts1 (geranylgeranyl diphosphate synthase) and Erg20, which enables GGPP production from FPP and IPP. Finally, pRS436GAP-PhTPS1 (S3 Table No.6) or pRS436GAP-ClTPS (S3 Table No.7) was transformed into each of the GPP-, FPP-, and GGPP-producing strains using the Frozen-EZ Yeast Transformation II Kit (Zymo Research, Irvine, CA, USA) to introduce TPS. The same substrate-producing strains were transformed with a pRS436GAP plasmid lacking TPS genes as a negative control (S3 Table No.5). Yeast strains expressing the pathway and carrying TPS plasmids were selected and maintained on synthetic dropout medium (SD-TRP-LEU-URA) according to plasmid markers and used in GC–MS analyses.
GC–MS analysis
All GC–MS experiments were performed using three independent biological replicates for each yeast strain. The terpene precursor (GPP, FPP, and GGPP) producing yeast carrying each TPS-expression vector and negative control strains—GPP-producing yeast (S3 Table No.19–21), FPP-producing yeast (S3 Table No.16–18), and GGPP-producing yeast (S3 Table No. 13–15)—were pre-cultured in 2 mL of selective medium (SD-TRP-LEU-URA) for 24 h. Yeast pellets were then collected through centrifugation at 500 × g for 1 min at 23 °C. Next, 2 mL of the pellets was transferred into 2 mL of fresh selective medium (SD-TRP-LEU-URA) and cultured for 72 h in a 20-mL screw-neck vial (GERSTEL). The headspace components were subsequently analyzed using GC–MS.
The GC–MS system (8890/5977C; Agilent Technologies, Santa Clara, CA, USA) was equipped with MPS roboticPRO (GERSTEL, Mülheim, Germany) for sample preparation under the following conditions. The vial containing the sample was heated at 60 °C for 15 min to equilibrate the headspace, and volatile compounds from the headspace were extracted for 30 min using a solid-phase microextraction fiber (DVB/CAR/PDMS, 23 gauge, fiber length: 2 cm; Merck, Darmstadt, Germany) before GC–MS injection. Separation was achieved on an HP-5MS UI column (length, 30 m; i.d., 0.25 mm; film thickness, 0.25 µm) under a constant helium carrier gas flow of 1.2 mL/min. The GC oven temperature was programmed as follows: an initial temperature of 40 °C was maintained for 5 min, increased to 280 °C at 10 °C/min, and maintained at 280 °C for 11 min. Detection was performed using a quadrupole mass selective detector operated at 70 eV in the combined SIM/SCAN mode. Specific target ions (m/z 68, 93, and 136) were monitored in the SIM mode, whereas ions in the range of m/z 30–250 were analyzed in the SCAN mode.
A mixture of C7–C33 n-alkanes (Hayashi Pure Chemical Ind., Ltd., Osaka, Japan) was analyzed under the same GC–MS conditions as those described above to calculate the Kovats retention index (RI). The RI values, calculated from the retention times of the analytes and n-alkane standards, were used alongside mass spectra to identify limonene and other volatiles. Mass spectral data were compared against the NIST/EPA/NIH Mass Spectral Library (NIST 20) for compound identification. Peak areas were compared across samples acquired under identical culture volumes, incubation times, and SPME extraction conditions; therefore, the values represent relative, semi-quantitative estimates rather than absolute concentrations.
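To make the retention index calculation concrete, the sketch below interpolates an analyte's retention time between the two bracketing n-alkanes. It uses the linear form commonly applied to temperature-programmed runs; the text does not state whether this form or the logarithmic isothermal Kovats form was used, so treat the choice as an assumption. The function name and retention times are hypothetical.

```python
def linear_retention_index(rt, alkane_rts):
    """Linear retention index of an analyte.

    rt         -- analyte retention time (min)
    alkane_rts -- dict mapping n-alkane carbon number to retention time (min)
    """
    carbons = sorted(alkane_rts)
    for n, n_next in zip(carbons, carbons[1:]):
        t_n, t_next = alkane_rts[n], alkane_rts[n_next]
        if t_n <= rt <= t_next:
            fraction = (rt - t_n) / (t_next - t_n)
            return 100 * (n + (n_next - n) * fraction)
    raise ValueError("retention time lies outside the alkane ladder")


# Hypothetical retention times for C10 and C11 and for one analyte peak.
ladder = {10: 10.0, 11: 12.0}
print(linear_retention_index(10.5, ladder))
```

An index obtained this way is compared with literature values for the same stationary phase; agreement of both the index and the mass spectrum is stronger evidence than either alone.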
Results
Tissue-specific DEG analysis of P. hadiensis
To rule out misidentification within the Plectranthus genus, we verified the species identity of the plant material using an RbcL barcode. The assembled RbcL protein was identical to the P. hadiensis reference (YP_010593282.1) (S3 Fig); this step was used only for sample authentication. RNA isolated from the leaf, stem, and root of P. hadiensis was sequenced. The average read count was approximately 20,000,000 reads per sample (S4 Table). De novo assembly of the reads resulted in a reference cDNA sequence containing 43,489 genes, with an average contig length of 1,016 bases (S5 Table). BUSCO v5 indicated C = 70.1% (S: 47.0%, D: 23.1%), F = 8.3%, and M = 21.6% for the predicted proteins, supporting overall completeness of the reference (S5 Table). Application of our threshold (|M value| ≥ 2.5 and A value > 0) revealed 3,759 DEGs in leaf > root, 1,801 DEGs in root > leaf, 1,281 DEGs in leaf > stem, 1,265 DEGs in stem > leaf, 1,986 DEGs in root > stem, and 3,716 DEGs in stem > root. Furthermore, the GO term enrichment analysis revealed that terms associated with photosynthesis were significantly enriched in the leaf compared with those in the root; no other notable tendencies were observed (S4 Fig).
Expression profiles of TPS gene families
We explored coding sequences with two TPS-specific domains (TPS N-terminal and metal-binding domains) in the cDNA sequences. Our analysis revealed 26 TPS genes expressed in P. hadiensis. We subsequently classified these identified TPS genes into specific TPS subfamilies, based on phylogenetic placement relative to reference TPS sequences. The molecular phylogenetic tree analysis revealed that P. hadiensis possesses TPS genes belonging to five TPS subfamilies (Fig 2). Specifically, 5/26 (19%) TPS genes were assigned to TPS-a, 6/26 (23%) to TPS-b, 6/26 (23%) to TPS-c, 6/26 (23%) to TPS-e/f, and 3/26 (12%) to TPS-g. No TPS genes belonging to the TPS-d or TPS-h subfamilies were detected. For a subset of TPS genes, the subfamily assignment was marked as ambiguous due to closely related second-best phylogenetic placements among neighboring TPS clades (S6 Table).
Fig 2. Phylogenetic tree of the 26 TPS genes identified in this study. This maximum-likelihood tree includes amino acid sequences of P. hadiensis TPS genes together with reference TPS sequences from A. thaliana and representative Lamiaceae species. Branch lengths represent evolutionary distance, and node labels indicate ultrafast bootstrap support values (UFBoot2, 1,000 replicates). TPS subfamilies are indicated by color coding, and P. hadiensis TPS genes are shown in red. Subfamily assignments are based on nearest phylogenetic placement relative to reference TPS sequences (see S6 Table).
We evaluated the expression distribution of the 26 TPS genes across different tissues. The numbers of DEGs were as follows: 10 in leaf > root, 6 in root > leaf, 1 in leaf > stem, 11 in stem > leaf, 5 in root > stem, and 6 in stem > root. As shown in Fig 3A, the expression patterns of TPS genes varied depending on tissue type. The normalized heatmap (Fig 3A) further categorizes the TPS genes in P. hadiensis into those primarily highly expressed in the leaf and stem (upper 6 genes), in the stem (middle 11 genes), and in the root (lower 9 genes). The TPS genes highly expressed in the root did not include any members of the TPS-b subfamily, whereas those highly expressed in the stem predominantly belonged to the TPS-a, TPS-b, and TPS-g subfamilies, as shown by the correspondence between TPS subfamilies (left column) and tissue-specific expression patterns in Fig 3A. S7 Table summarizes, for each TPS gene, the putative subfamily, top BLASTP annotation, and tissue-specific expression category; actual expression values (TPM, by tissue/replicate) are provided in S8 Table.
Fig 3. Tissue-specific expression profiles of TPS genes in P. hadiensis. (A) Z-score normalized tissue-specific expression profiles of TPS genes in P. hadiensis. The heatmap shows the 26 TPS genes with expression levels normalized to a mean of 0 and a variance of 1 (z-score normalization). The putative subfamily for each TPS gene is displayed on the left side of the heatmap. (B) A heatmap showing the expression levels of the 26 TPS genes in the leaf, stem, and root, presented as TPM. The total TPM per sample sums to one million, allowing for comparisons across different samples. (A and B) The color scale (red: higher, blue: lower) indicates relative expression. PhTPS1, indicated by an arrow, was highly expressed in the leaf and stem.
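The per-gene z-score normalization in panel A can be sketched as follows. The sketch uses the sample standard deviation (n − 1 denominator), as R's sd() does; the caption does not state which variance estimator was used, so this is an assumption, as are the function name and the toy values.

```python
import statistics


def zscore_row(values):
    """Rescale one gene's expression values to mean 0 and unit variance.

    Uses the sample standard deviation (n - 1 denominator). A gene with
    identical values in all samples has no variance, so zeros are returned
    (a design choice of this sketch)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(value - mean) / sd for value in values]


# Hypothetical TPM values for one gene in three samples.
print(zscore_row([1.0, 2.0, 3.0]))
```

Row-wise scaling removes differences in absolute abundance between genes, which is why panel A shows tissue preference while panel B, in TPM, shows which genes dominate overall.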
Of the 26 TPS genes identified, 1 (gene ID: DN2977_c0_g1_i1) exhibited a markedly high expression in the leaf and stem (Fig 3B, indicated by an arrow). This gene was designated as PhTPS1. In a BLASTP search for the amino acid sequence homology of PhTPS1, the top hit was a limonene synthase gene from rosemary (Salvia rosmarinus, Accession no. ABD77416.1), a member of the Lamiaceae family (S9 Table). Subsequent top hits predominantly matched monoterpene synthase genes from Lamiaceae species, suggesting a conserved role of the gene in monoterpene biosynthesis within this family. Considering the possibility of allelic variants, we remapped the RNA-Seq reads to PhTPS1. Two alleles, PhTPS1 allele1 and PhTPS1 allele2, differed by the presence or absence of a two-amino-acid insertion near the N-terminus (S5 and S6 Figs). Both alleles were expressed at approximately equal proportions in the leaf, stem, and root (S10 Table and S7 Fig). We subsequently cloned and introduced the shorter allele (S5 and S6 Figs, PhTPS1 allele2) into yeast to investigate its TPS activity and substrate specificity.
Characterization of the TPS activity of PhTPS1
We utilized the GPP-, FPP-, and GGPP-producing yeast lines because it was initially unclear whether PhTPS1 encodes a mono-, sesqui-, or diterpene synthase (S8 Fig). Overexpression of HMG1 in the pRS504HMG1/YPH499 background has been reported to activate the mevalonate pathway, thereby increasing the pool of the immediate isoprenoid precursors IPP and DMAPP [64]. Introducing ERG20(A99P), ERG20, or GGF into this background allowed us to direct the production of GPP, FPP, or GGPP, respectively, in order to evaluate substrate specificity. Notably, the A99P mutation in ERG20 has been reported to shift the enzyme activity away from FPP synthase while retaining GPP synthase function, thereby enabling GPP accumulation from IPP and DMAPP [65]. Plasmid inserts for PhTPS1 and ClTPS were sequence-confirmed using Sanger sequencing prior to transformation (see Materials and methods, Molecular cloning of the P. hadiensis TPS gene). Yeast transformants were maintained under SD-TRP-LEU-URA selection, and limonene was detected in the PhTPS1- or ClTPS-expressing strains but not in empty-vector controls (Fig 4), supporting functional expression.
Fig 4. Peak areas of limonene detected in each yeast strain. Error bars indicate the standard deviation across three independent biological replicates.
We then expressed PhTPS1 or ClTPS in each of the GPP-, FPP-, or GGPP-producing strains and analyzed the resulting volatile compounds using GC–MS. A prominent peak corresponding to standard limonene was observed in GPP-producing yeast expressing either PhTPS1 or ClTPS (S9 Fig). Mass spectral analysis of this peak closely matched the reference library spectrum of limonene (S10 Fig). Fig 4 shows the GC–MS peak areas of limonene across the three substrate-producing strains. Limonene production was apparent in all three strains in the positive-control yeast expressing ClTPS, which displayed substantial peak areas. Although the GPP-producing strain expressing PhTPS1 yielded a smaller limonene peak than that expressing ClTPS, it produced considerably more limonene than the negative control. These findings confirm that PhTPS1 encodes a functional limonene synthase.
Discussion
In this study, we performed RNA-Seq across multiple tissues of P. hadiensis and identified 26 TPS genes using de novo transcriptome assembly. Expression profiling revealed marked tissue specificity among TPS genes, and one gene (PhTPS1, gene ID: DN2977_c0_g1_i1) exhibited exceptionally high expression in the leaf and stem. Such highly skewed TPS expression profiles have rarely been reported in transcriptome-wide surveys. A subsequent BLAST analysis revealed that PhTPS1 has high similarity to limonene synthase genes from other Lamiaceae species, suggesting its potential role in limonene biosynthesis. To test this hypothesis, we synthesized PhTPS1, cloned and expressed it in yeast, and analyzed the volatile compounds in the culture medium using GC–MS. Limonene production in PhTPS1-expressing yeast strains supported its role in limonene biosynthesis.
The identified TPS genes in P. hadiensis showed pronounced tissue specificity, indicating that TPS expression in P. hadiensis is not restricted to leaf tissue. This pattern is consistent with observations in other species, such as Rosa chinensis [66] and Panicum virgatum L. [67], where certain TPS genes are also predominantly expressed in the root. Such tissue-specific expression has been interpreted as evidence of functional diversification among TPS genes, with root-expressed genes contributing to belowground chemical defense, interactions with soil microorganisms, or stress-induced terpene production, and stem-enriched genes being implicated in constitutive or inducible defense against herbivores and pathogens or in the regulation of volatile emissions along the plant axis [28,68,69]. Collectively, these findings suggest that TPS expression patterns in P. hadiensis are likely to reflect diversification of terpene functions across organs. However, P. hadiensis is distinguished by the exceptionally high expression of a single gene, PhTPS1, in both leaves and stems, implying that PhTPS1 may play a dominant role in the biosynthesis of a major terpene compound.
Consistent with this hypothesis, limonene was detected among volatile compounds extracted from the culture medium of PhTPS1-expressing yeast. Limonene levels were markedly higher in yeast strains engineered to produce GPP than in those producing FPP or GGPP. The residual limonene detected in the FPP- and GGPP-producing strains is likely due to trace GPP synthesis in those strains. The measurement method employed in this study provides only a semi-quantitative estimate based on relative GC–MS peak areas, without internal standards or absolute calibration curves; nevertheless, this approach is sufficient for comparative analyses under identical experimental conditions. Accordingly, the pronounced enrichment of limonene in GPP-producing yeast is consistent with the biochemical role of GPP as the direct substrate for monoterpene biosynthesis, including limonene (Fig 1). In a previous study [29], limonene comprised a high proportion (34.69%) of P. hadiensis extracts, which is consistent with the high expression of PhTPS1 in P. hadiensis observed in our study. Although TPS gene families in plants, particularly in Lamiaceae, often exhibit redundancy and sub-functionalization [13,69], individual TPS genes can still act as major determinants of essential oil composition [69,70]. Taken together, the exceptionally high expression and functional validation suggest PhTPS1 as a key contributor to limonene biosynthesis in P. hadiensis.
We identified five TPS subfamilies in P. hadiensis (TPS-a, TPS-b, TPS-c, TPS-e/f, and TPS-g), whereas TPS-d and TPS-h, largely found in gymnosperms and lycophytes [13], respectively, were not detected. The relative proportions of TPS subfamilies in P. hadiensis were 19% (TPS-a), 23% (TPS-b), 23% (TPS-c), 23% (TPS-e/f), and 12% (TPS-g). This composition broadly matches previous surveys of Lamiaceae TPS repertoires, in which TPS-a, -b, -c, and -e/f constitute the major components [53]. However, the proportion of TPS-e/f genes in P. hadiensis (23%) is higher than the average reported for Lamiaceae species (approximately 11%) [53]. A similarly elevated representation of TPS-e/f genes has been reported in Mentha longifolia, in which approximately 27% of TPS genes belong to this subfamily [20]. These observations suggest that expansion of TPS-e/f genes may have occurred independently in multiple Lamiaceae lineages, including P. hadiensis.
Although 26 TPS genes were identified in this study, additional TPS genes may have remained undetected. For instance, Arabidopsis thaliana encodes 32 TPS genes [71], whereas Oryza sativa, Sorghum bicolor, and Selaginella moellendorffii harbor 34, 24, and 14 full-length TPS genes, respectively [13]. The number of identified TPS genes can vary according to detection methods. For example, Yan et al. [66] identified 40, 39, and 13 TPS genes in O. sativa, S. bicolor, and S. moellendorffii, respectively. In Lamiaceae, more comprehensive genome-based studies have reported larger repertoires; for example, 52 TPS genes were identified in Callicarpa americana using long-read genome assembly [72]. Similarly, a recent genome-wide analysis of four economically important culinary herbs (sweet basil (Ocimum basilicum L.), sweet marjoram (Origanum majorana L.), oregano (Origanum vulgare L.), and rosemary (Rosmarinus officinalis L.)) identified 235 TPS genes in total, ranging from 27 in O. majorana to 137 in the tetraploid O. basilicum [73]. These studies suggest that P. hadiensis may harbor a larger TPS repertoire than indicated by our data. This limitation likely reflects the design of our reference resource: a de novo transcriptome assembled from three tissues grown under indoor conditions. Consequently, the BUSCO score reflects expressed gene space rather than the full genomic repertoire. In addition, transcriptome assembly was performed using subsampled RNA-seq data (1 million reads per sample) to reduce computational requirements, which may have further reduced recovery of lowly expressed transcripts. Accordingly, BUSCO completeness of our assembly (70.1%) was lower than that reported for some other non-model Lamiaceae transcriptomes (e.g., S. miltiorrhiza drought transcriptome, C: 88.2% [74]; S. hispanica tissue transcriptomes, C: 92.2%–92.8% [75]).
Despite these constraints, our transcriptome assembly was sufficient to identify 26 TPS candidates and to enable functional validation of PhTPS1. A future whole-genome assembly could reveal additional TPS genes and provide a stronger foundation for functional and evolutionary analyses in P. hadiensis. In addition, integrating this transcriptomic resource with future genome assemblies, comprehensive volatilome profiling across chemotypes, and targeted in planta functional assays (e.g., CRISPR/Cas-based approaches) will extend its utility beyond the characterization of limonene biosynthesis.
Conclusion
In this study, we confirmed that PhTPS1 encodes an active enzyme involved in limonene biosynthesis that is functional when expressed in yeast. Identifying an active TPS gene in P. hadiensis is particularly important because not all plant TPS genes are functional [13]. Thus, this study not only contributes to breeding efforts targeting PhTPS1 for enhanced limonene production but also advances the understanding of TPS gene evolution within the Lamiaceae family.
Supporting information
S1 Table. TPS genes used for classification. (XLSX)
S2 Table. Primer list. (XLSX)
S3 Table. Plasmids, vectors, and yeast strains used in this study. All the plasmids and vectors contained the TDH3 promoter and CYC1 terminator from S. cerevisiae for gene expression. (XLSX)
S4 Table. Sequencing results for each sample. (XLSX)
S5 Table. Results of de novo assembly under various conditions. (XLSX)
S6 Table. TPS genes used for phylogenetic classification and subfamily assignment in P. hadiensis. (XLSX)
S7 Table. Summary of TPS genes with phylogeny-based subfamily assignment and tissue-resolved expression (H/M/L/ND; High/Medium/Low/Not detected). Expression categories were assigned per tissue using quartiles of the mean across replicates: ND = 0; L ≤ Q25; Q25 < M ≤ Q75; H > Q75. Thresholds: leaf Q25 = 0.15, Q75 = 5.56; stem Q25 = 2.62, Q75 = 42.78; root Q25 = 0.39, Q75 = 13.37 (normalized units). (XLSX)
S8 Table. Raw TPM values for TPS genes by tissue and replicate. (XLSX)
S9 Table. BLAST homology search results for the amino acid sequence of PhTPS1. The amino acid sequence of PhTPS1 was used as the query sequence. The non-redundant protein sequences (nr) database was used for the homology search, and the BLASTP algorithm was used for analysis. (XLSX)
S10 Table. Number and proportion of reads corresponding to allele1 and allele2 in the remapping to PhTPS1. (XLSX)
S11 Table. Transcript IDs, TSA accession numbers, and CDS coordinates of analyzed TPS genes. (XLSX)
S1 Fig. P. hadiensis tissues used in this study (from left to right: whole plant, leaf and stem, and root). (PDF)
S2 Fig. Workflow for TPS identification, expression profiling, and yeast validation in P. hadiensis. (PDF)
S3 Fig. Amino acid sequence alignments of the putative RbcL protein (TRINITY_DN2794_c0_g1_i1) and the known P. hadiensis RbcL protein (YP_010593282.1). (PDF)
S4 Fig. GO (biological process) terms enriched in the leaf relative to those in the root. (PDF)
S5 Fig. Nucleotide sequence alignment of PhTPS1 Allele1 and Allele2. (PDF)
S6 Fig. Protein sequence alignment of PhTPS1 Allele1 and Allele2. (PDF)
S7 Fig. IGV visualization of reads remapped to PhTPS1, highlighting the magnified region where two PhTPS1 alleles are present. The gray vertical bars at the top indicate the read depth at each nucleotide position. The depth from position 66 to position 71 is approximately half of that at the surrounding positions, suggesting the presence of a six-nucleotide deletion in one allele (allele2) and retention of these six nucleotides in the other allele (allele1). (PDF)
S8 Fig. Overview of the workflow for yeast strain generation in this study. The numbers in square brackets indicate the identifiers (numbers) of the plasmids, vectors, or yeast strains listed in S3 Table. (PDF)
S9 Fig. Total ion chromatograms of headspace samples from vials containing cultured GPP-producing yeast, as analyzed using GC–MS. The top panel shows an extracted ion chromatogram focusing on m/z 93 at retention times around 13.1 min. (PDF)
S10 Fig. Comparison of the mass spectra. The upper spectrum represents the mass spectrum of the volatile compound in the headspace of vials containing cultured PhTPS1/GPP. The lower spectrum shows the reference library spectrum of limonene from the NIST 20 Mass Spectral Library. Major fragment ions at m/z 68, 93, 121, and 136 indicate a high similarity between the sample and library spectra. (PDF)
S1 Appendix. Nucleotide sequences (coding sequences) of the 26 TPS genes identified in P. hadiensis. (CDS)
S2 Appendix. Amino acid sequences of the 26 TPS proteins identified in P. hadiensis. (PEP)
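The expression categories defined for S7 Table (ND = 0; L ≤ Q25; Q25 < M ≤ Q75; H > Q75) can be expressed as a small function. This is a minimal sketch using the per-tissue thresholds stated in the table legend; the function and variable names are illustrative and are not taken from the authors' analysis code.

```python
# Per-tissue (Q25, Q75) thresholds in normalized units, as stated in the
# S7 Table legend.
THRESHOLDS = {
    "leaf": (0.15, 5.56),
    "stem": (2.62, 42.78),
    "root": (0.39, 13.37),
}


def expression_category(mean_value: float, tissue: str) -> str:
    """Map a replicate-averaged expression value to ND, L, M, or H."""
    q25, q75 = THRESHOLDS[tissue]
    if mean_value == 0:
        return "ND"  # not detected
    if mean_value <= q25:
        return "L"   # low: at or below the first quartile
    if mean_value <= q75:
        return "M"   # medium: above Q25, up to and including Q75
    return "H"       # high: above the third quartile


if __name__ == "__main__":
    # Illustrative values only, not measurements from the study.
    for tissue, value in [("leaf", 0.0), ("leaf", 6.0), ("stem", 6.0), ("root", 0.2)]:
        print(tissue, value, expression_category(value, tissue))
```

Because the thresholds are tissue-specific, the same normalized value can fall into different categories in different tissues; for example, a value of 6.0 exceeds the leaf Q75 but lies between the stem quartiles.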
References
- 1. Dhifi W, Bellili S, Jazi S, Bahloul N, Mnif W. Essential Oils’ Chemical Characterization and Investigation of Some Biological Activities: A Critical Review. Medicines (Basel). 2016;3(4):25. doi: 10.3390/medicines3040025. PMID: 28930135; PMCID: PMC5456241.
- 2. Moss GP, Smith PAS, Tavernier D. Glossary of class names of organic compounds and reactivity intermediates based on structure (IUPAC Recommendations 1995). Pure and Applied Chemistry. 1995;67(8–9):1307–75. doi: 10.1351/pac199567081307.
- 3. Chen R, Wang M, Keasling JD, Hu T, Yin X. Expanding the structural diversity of terpenes by synthetic biology approaches. Trends Biotechnol. 2024;42(6):699–713. doi: 10.1016/j.tibtech.2023.12.006. PMID: 38233232.
- 4. Christianson DW. Structural and Chemical Biology of Terpenoid Cyclases. Chem Rev. 2017;117(17):11570–648. doi: 10.1021/acs.chemrev.7b00287. PMID: 28841019; PMCID: PMC5599884.
- 5. Geron C, Rasmussen R, Arnts RR, Guenther A. A review and synthesis of monoterpene speciation from forests in the United States. Atmos Environ. 2000;34:1761–81. doi: 10.1016/S1352-2310(99)00364-7.
- 6. Seigler DS. Sesquiterpenes. In: Seigler DS, editor. Plant secondary metabolism. Boston: Springer. 1998:277–303.
- 7. Bose SK, Yadav RK, Mishra S, Sangwan RS, Singh AK, Mishra B, et al. Effect of gibberellic acid and calliterpenone on plant growth attributes, trichomes, essential oil biosynthesis and pathway gene expression in differential manner in Mentha arvensis L. Plant Physiol Biochem. 2013;66:150–8. doi: 10.1016/j.plaphy.2013.02.011. PMID: 23514759.
- 8. Lanzotti V. Diterpenes for Therapeutic Use. Natural Products. Springer Berlin Heidelberg. 2013:3173–91. doi: 10.1007/978-3-642-22144-6_192.
