Genome-Wide Identification and Analysis of Plant Cysteine Oxidase (PCO) Family Genes and Expression Pattern Under Abiotic Stresses in Medicago sativa
Rui Wang, Xiaojie Zhang, Xiao Han, Lili Gu, An Yan, Wenxian Yang, Yiqiang Ren, Zhenwei Ren

TL;DR
This paper identifies and analyzes cysteine oxidase genes in alfalfa, revealing their role in stress responses and plant growth.
Contribution
The study systematically identifies and characterizes PCO genes in Medicago sativa for the first time.
Findings
35 MsPCO genes were identified and analyzed in Medicago sativa.
MsPCO genes are distributed asymmetrically and clustered into five subgroups.
Promoter elements suggest roles in stress adaptation and hormone signaling.
Abstract
Plant cysteine oxidase (PCO) catalyzes the oxidation of cysteine residues in the N-degron pathway, thereby regulating the stability and activity of the seventh group of ethylene response factors (ERF-VII), which play a crucial role in reactive oxygen species (ROS)-mediated signal transduction. By regulating the degradation of ERF-VII, the PCO family genes control hormone signaling, which is highly valuable for plant growth and abiotic stress responses. However, systematic studies on PCO genes in Medicago sativa, a key forage legume, remain lacking. Herein, 35 MsPCO genes were identified from the alfalfa (Medicago sativa) genome, and their biological characteristics were comprehensively analyzed via bioinformatics approaches. The results showed that MsPCO genes are asymmetrically distributed across 18 chromosomes and clustered into 5 subgroups phylogenetically. Most MsPCO proteins are…
Funding
- Key Research and Development Project of Xinjiang Uygur Autonomous Region
- Resource platform construction project of Xinjiang Uygur Autonomous Region
1. Introduction
The stability and functional activity of Ethylene Response Factor group VII (ERF-VII) transcription factors are crucially regulated by plant cysteine oxidases (PCOs), which are fundamental regulators in the N-degron pathway. A critical post-translational modification that regulates the turnover of these transcription factors, the oxidation of N-terminal cysteine residues in ERF-VII proteins, is uniquely mediated by PCOs. ERF-VII proteins are essential components of signaling cascades driven by Reactive Oxygen Species (ROS) and coordinate the transcriptional activation of numerous stress-responsive genes. In addition to participating in hormonal signaling pathways influenced by abscisic acid and gibberellins, this regulatory axis allows plants to precisely adjust their adaptive responses to a wide range of abiotic challenges, such as salinity, severe temperatures and drought [1].
PCOs belong to the cysteine dioxygenase family and participate in the regulation of ERF-VII transcription factors. A distinctive feature of ERF-VII proteins lies in their dual environmental sensing capacity: they directly perceive fluctuations in oxygen availability via the Cys/Arg branch of the N-end rule pathway, while also indirectly responding to nitric oxide (NO) signals generated downstream of nitrate reductase (NR)-mediated metabolic processes [2]. This unique molecular regulatory module, anchored in the cysteine branch of the N-end rule pathway, endows ERF-VII transcription factors with the ability to integrate multiple environmental cues and coordinate adaptive responses to a wide array of abiotic stresses [3]. Furthermore, accumulating evidence has established ERF-VII proteins as central hubs in plant hormone signaling networks, with particular relevance to mediating crosstalk between gibberellin and abscisic acid pathways. ERF-VIIs cooperatively coordinate plant developmental programs and stress adaptation tactics through this hormonal cross-regulation [1].
Extensive investigations have solidified the understanding that transcription factors in Glycine max modulate gene expression programs through specific binding to cis-regulatory DNA sequences located in the 5′-upstream regions of target genes [4]. For instance, the transcription factor HSFB2b has been experimentally validated to directly interact with GmC4H and GmCHS3, thereby regulating their transcriptional activity under salt stress conditions—an interaction supported by the presence of conserved heat shock elements (HSEs) within the promoter regions of these two target genes [5]. This established model of transcriptional control offers a basis for examining gene families across different plant species. Nonetheless, existing understanding of the PCO gene family is still phylogenetically limited and functionally specific. Thus far, functional investigations of PCO have predominantly been performed in Marchantia polymorpha, Klebsormidium nitens, Arabidopsis thaliana and Brassica napus [6,7,8]. While research in A. thaliana has elucidated the conserved, core role of PCOs in oxygen sensing [7], studies in B. napus have been particularly revealing for stress biology. This crop not only retains conserved hypoxia signaling pathways but also exhibits a notable expansion and functional diversification of its PCO gene family in response to a broad spectrum of abiotic stresses, thereby providing a critical evolutionary and functional bridge [8]. Against this backdrop, a primary objective of this study is to investigate whether a similar functional diversification and stress-responsive expansion of the PCO family exists in alfalfa, a phylogenetically distant and agronomically crucial forage legume. The rationale for this comparative focus is twofold. First, B. napus currently represents one of the best-characterized systems for PCO-mediated stress responses beyond hypoxia. 
In line with the established regulatory paradigm, promoters of BnaPCO genes harbor diverse stress and hormone-responsive cis-elements, and their expression is distinctly responsive to drought, freezing, cold and waterlogging [8]. Second, and more pertinent to forage improvement, the abiotic stresses (e.g., drought, waterlogging) documented in B. napus are among the primary constraints that severely limit biomass yield and persistence in perennial forage legumes like alfalfa. Specific examples, such as the waterlogging-induced, root-specific expression of BnaC04G0395100ZS [9], underscore that stress responses are orchestrated through intricate, tissue-specific regulatory networks [4].
Despite these advances, the PCO gene family remains entirely uncharacterized in forage legumes. This constitutes a significant knowledge gap, as the regulatory paradigms and stress-adaptive roles discovered in B. napus may not directly translate to the distinct physiology and genetic architecture of legumes. Therefore, by framing our investigation against the well-documented biology of B. napus PCOs, this study seeks to establish a comparative benchmark. Our ultimate goal is to elucidate whether alfalfa has evolved unique mechanisms of PCO gene regulation and function tailored to its specific ecological and agronomic stress profiles.
Medicago sativa is a globally vital forage crop, valued for its high nutritional quality and environmental adaptability [10]. However, its productivity is increasingly threatened by abiotic stresses, with salinity and drought being primary constraints [11]. This underscores the urgent need to develop more resilient cultivars, which depends on a deeper molecular understanding of stress adaptation.
Notably, while PCO genes are implicated in drought, salt, and cold stress responses in B. napus [8], their physiological roles in alfalfa are completely unknown. To address this gap, we conducted the first genome-wide identification and phylogenetic analysis of the PCO family in M. sativa. We characterized the expression patterns of MsPCO genes across tissues and under key abiotic stresses, leading to the screening and prioritization of core candidate genes (notably MsPCO19 and MsPCO20) based on their responsiveness and subcellular localization. Through this systematic approach, we aim to provide a foundational resource and concrete genetic targets for future functional validation and molecular breeding, ultimately contributing to enhanced abiotic stress tolerance in this vital forage crop.
2. Results
2.1. Identification and Chromosomal Localization of the MsPCO Gene Family
By combining hidden Markov model (HMM) searches with BLAST sequence alignment in TBtools (version 2.364), we identified 35 PCO genes in the M. sativa genome. Chromosomal localization showed that these genes are unevenly distributed: 34 of the 35 MsPCO genes mapped to 18 chromosomes, while one gene (designated MsPCO35) could not be assigned to any chromosome because of incomplete genome assembly (Figure 1). The 34 mapped genes were named MsPCO1 to MsPCO34 according to their chromosomal positions, following the standard nomenclature for plant gene families. Gene number varied among the mapped chromosomes. Chromosomes chr2.2, chr2.3, chr6.2 and chr7.4 carried the most, with 3 MsPCO genes each; chr2.1, chr2.4, chr4.2, chr5.4, chr6.3, chr7.1, chr7.2 and chr7.3 each contained 2; and chr3.2, chr4.1, chr5.2, chr5.3, chr6.1 and chr6.4 each contained only 1.
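As a quick consistency check, the per-chromosome counts reported above can be tallied against the stated totals. The sketch below is only an illustration: it uses the chromosome labels and counts given in the text and adds the one unanchored gene separately.

```python
# Per-chromosome MsPCO gene counts exactly as listed in the text.
counts = {}
for chrom in ["chr2.2", "chr2.3", "chr6.2", "chr7.4"]:
    counts[chrom] = 3
for chrom in ["chr2.1", "chr2.4", "chr4.2", "chr5.4",
              "chr6.3", "chr7.1", "chr7.2", "chr7.3"]:
    counts[chrom] = 2
for chrom in ["chr3.2", "chr4.1", "chr5.2", "chr5.3", "chr6.1", "chr6.4"]:
    counts[chrom] = 1

n_chromosomes = len(counts)       # chromosomes carrying at least one MsPCO gene
n_mapped = sum(counts.values())   # genes anchored to a chromosome
n_total = n_mapped + 1            # plus the unanchored gene (MsPCO35)

print(n_chromosomes, n_mapped, n_total)  # 18 34 35
```

The tally reproduces the figures in the text: 18 chromosomes, 34 mapped genes, and 35 genes in total.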
2.2. Phylogenetic Analysis of the PCO Family
To elucidate the evolutionary relationships of PCO proteins, we performed phylogenetic analysis incorporating sequences from five representative plant species—M. sativa, Medicago truncatula, A. thaliana, G. max and Oryza sativa—and the resulting neighbor-joining phylogenetic tree classified all PCO family members into five distinct subfamilies (Figure 2). Notably, MsPCO proteins from alfalfa did not form species-specific monophyletic clusters but exhibited an interspersed distribution pattern across the different subfamilies, with MsPCO members clustering closely with their orthologs from other species. This evolutionary topology strongly suggested that PCO genes within the same subfamily shared conserved functional roles throughout plant evolution, consistent with the notion that orthologous genes retaining similar phylogenetic positions tend to preserve ancestral biological functions.
2.3. Analysis of Protein Physicochemical Properties and Subcellular Localization Prediction
Physicochemical characterization of the 35 MsPCO proteins showed molecular weights ranging from 9.96 to 56.91 kDa and theoretical isoelectric points (pI) ranging from 4.23 to 8.53, reflecting variation in amino acid composition and charge distribution. Based on the instability index, MsPCO7 was the only stable member of the family, with a value below 40, the threshold generally taken to indicate a stable protein. Hydrophobicity analysis further set MsPCO7 apart: it was the only protein with a grand average of hydropathicity (GRAVY) value greater than 0 (indicating hydrophobicity), whereas all other MsPCO proteins were hydrophilic (GRAVY < 0). The aliphatic indices were all below 100. Subcellular localization predictions indicated that most MsPCO proteins are targeted to the cytoplasm, while a subset showed distinct predicted locations: MsPCO1/4/6/9/16/27 in the nucleus, MsPCO12 in chloroplasts, and MsPCO15/17/18 in peroxisomes (Table S1).
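To make the GRAVY and aliphatic index values concrete, here is a minimal sketch of how both are conventionally computed from a protein sequence. It assumes the standard Kyte-Doolittle hydropathy scale and Ikai's aliphatic index formula. The two peptides are invented toy sequences, not MsPCO proteins, and the helper names are hypothetical. The tool the authors used is not named in this section.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue.

    Raises KeyError on non-standard residue codes.
    """
    seq = seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)

def aliphatic_index(seq: str) -> float:
    """Ikai's index: X_Ala + 2.9 * X_Val + 3.9 * (X_Ile + X_Leu), X in mole percent."""
    seq = seq.upper()

    def pct(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))

def classify(seq: str) -> str:
    """Label a sequence by the sign of its GRAVY value, as in the text."""
    return "hydrophobic" if gravy(seq) > 0 else "hydrophilic"

# Invented toy peptides, for illustration only.
for peptide in ("MKDEERSK", "MAILVLLA"):
    print(peptide, round(gravy(peptide), 4), classify(peptide))
```

A charged, polar peptide gives a negative GRAVY value and a peptide rich in Ile, Leu and Val gives a positive one, which is the distinction the text draws between MsPCO7 and the other family members.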
2.4. MsPCO Amino Acid Motif, Protein Conserved Domain, and Gene Structure Analysis
Comprehensive analysis of conserved motifs combined with phylogenetic relationship inference revealed that the 35 MsPCO proteins in alfalfa harbor ten distinct conserved motifs, with notable conservation patterns across the family: Motif1 was present in all MsPCO proteins except MsPCO7/13/14, while Motif7 was conserved across all members excluding MsPCO7/20/23, reflecting functional constraints on key structural elements within the PCO family. Domain analysis further confirmed structural and functional conservation of the MsPCO family, as all 35 proteins contained either the canonical PCO_ADO domain or its superfamily variant—a core structural feature essential for PCO enzymatic activity. Gene structure analysis uncovered substantial variability in exon–intron organization among MsPCO genes: MsPCO16 exhibited the most complex architecture with 12 exons and 11 introns, whereas MsPCO3/7/11/21 and 32 displayed the simplest structure (three exons and two introns), with intermediate configurations observed in MsPCO1/13/17/18/35 (four exons and three introns), MsPCO6/9 (six exons and five introns), and the remaining members (five exons and four introns) (Figure 3). Sequence alignment further validated the presence of the PCO_ADO domain in all MsPCO proteins, with the incomplete Motif1 observed in MsPCO7/13/14 attributed to their relatively shorter amino acid sequences, which likely truncate this conserved motif (Figure S1, Table S2).
2.5. Prediction of Secondary Structure and Three-Dimensional Structure of MsPCO Protein
Secondary structure analysis of MsPCO proteins revealed four structural elements: α-helices, β-sheets, β-turns, and random coils (Table S3). Their proportions varied substantially among MsPCO1–35. α-Helical content ranged from 1.18% to 32.16% (mean = 18.45%), β-sheet content from 15.83% to 34.06% (mean = 19.77%), and β-turn content, which had the narrowest range, from 0% to 4.31% (mean = 2.32%). Random coils were the predominant component, accounting for 47.65% to 70.59% of each structure (mean = 59.47%), and thus serve as the primary structural element within the MsPCO protein family. The three-dimensional structures predicted with SWISS-MODEL were generally consistent with the secondary structure predictions (Figure 4).
Interaction network analysis placed the Arabidopsis homologs of the alfalfa PCO proteins at the core of the predicted network (Figure 5). A total of 20 potential interacting proteins were predicted, spanning multiple functional families: the aspartic protease family (e.g., ATE1.2, ATE2), ubiquitination-related proteins (e.g., UBN1), oxidative stress response proteins (e.g., HRA1), zinc finger proteins (e.g., LBD41), and metabolic enzymes (e.g., ADH1, PDC1, CYS2). These proteins participate in protein modification, stress response, and metabolism, and they provide candidate interaction targets for further elucidating the functional mechanisms of the alfalfa PCO family.
2.6. Cis-Acting Element Analysis of MsPCO Promoters
Analysis of cis-regulatory elements within the 2000 bp promoter regions upstream of the MsPCO genes identified four major functional categories, each comprising several motif types (Figure 6). Hormone-responsive elements included auxin-responsive motifs (TGA-element, TGA-box, AuxRR-core), gibberellin-responsive motifs (GARE-motif, P-box), abscisic acid-responsive elements (ABRE), salicylic acid-responsive elements (TCA-element), and MeJA-responsive motifs (TGACG-motif, CGTCA-motif). Stress-responsive elements comprised low-temperature response elements (LTR), defense/stress response elements (TC-rich repeats), anaerobic induction elements (ARE), drought response elements (MBS), and flavonoid biosynthesis elements (MBSI). Light-responsive elements included ACE, G-Box, and GT1-motif. Growth/development-related elements encompassed AT-rich element, Box III, circadian, GCN4_motif, and O2-site. All 35 MsPCO genes contained 1–22 hormone-responsive elements, and all members except MsPCO4 harbored 1–11 stress-responsive elements. The prevalence of these functionally diverse regulatory elements suggests that MsPCO genes may help integrate hormonal signaling pathways and orchestrate adaptive responses to various environmental stresses.
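The per-gene element counts reported above amount to grouping motif hits by functional category. The following sketch shows one way to do that grouping, using the motif-to-category assignments given in the text. The list of promoter hits and the function name are hypothetical, and the annotation tool that produced the original hits is not specified in this section.

```python
from collections import Counter

# Motif-to-category assignments as described in the text.
CATEGORY = {
    # hormone-responsive
    "TGA-element": "hormone", "TGA-box": "hormone", "AuxRR-core": "hormone",
    "GARE-motif": "hormone", "P-box": "hormone", "ABRE": "hormone",
    "TCA-element": "hormone", "TGACG-motif": "hormone", "CGTCA-motif": "hormone",
    # stress-responsive
    "LTR": "stress", "TC-rich repeats": "stress", "ARE": "stress",
    "MBS": "stress", "MBSI": "stress",
    # light-responsive
    "ACE": "light", "G-Box": "light", "GT1-motif": "light",
    # growth/development-related
    "AT-rich element": "growth", "Box III": "growth", "circadian": "growth",
    "GCN4_motif": "growth", "O2-site": "growth",
}

def count_by_category(motif_hits):
    """Count motif hits per functional category, ignoring unlisted motifs."""
    return Counter(CATEGORY[m] for m in motif_hits if m in CATEGORY)

# Hypothetical hits for a single promoter (not real MsPCO data).
hits = ["ABRE", "ABRE", "ARE", "G-Box", "MBS", "CGTCA-motif", "TATA-box"]
summary = count_by_category(hits)
print(dict(summary))
```

Motifs outside the four categories, such as core promoter elements, are skipped so that they do not inflate any category total.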
2.7. MsPCO Gene Collinearity Analysis
Genomic collinearity analysis of the MsPCO gene family uncovered 56 gene duplication events among the 35 identified alfalfa PCO genes (Figure 7A). These events showed diverse patterns, including one-to-one, one-to-two, one-to-three, one-to-four, one-to-five, and one-to-six paralogous relationships, and indicate that 24 of the 35 MsPCO genes underwent duplication during alfalfa genome evolution. Selection pressure analysis of the collinear MsPCO gene pairs further showed that every pair had a ratio of non-synonymous to synonymous substitution rates (Ka/Ks) below 1 (Table S4). This is evidence of purifying selection acting on the MsPCO gene family and suggests that these genes were functionally constrained during evolution, retaining conserved biological roles relevant to alfalfa growth and stress adaptation.
To gain deeper insight into the evolutionary trajectory of the MsPCO gene family, we extended the collinearity analysis to three other representative plant species (M. truncatula, A. thaliana and G. max) to explore interspecific syntenic relationships. The number of syntenic gene pairs varied markedly: 38 pairs were identified between M. sativa and M. truncatula, 13 between M. sativa and A. thaliana, and 75 between M. sativa and G. max (Figure 7B). PCO genes are therefore syntenically conserved between M. sativa and all three species, with far more pairs shared with the two legumes than with A. thaliana and the largest number shared with G. max, likely reflecting shared leguminous ancestry. Beyond identifying evolutionary relationships, this multi-species comparison provides a theoretical basis for the subsequent functional validation of MsPCO genes and for breeding stress-tolerant alfalfa cultivars using gene-based molecular techniques. It also facilitates a more thorough understanding of the conservation and evolutionary differentiation of the PCO gene family across plant lineages.
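The interpretation of Ka/Ks used here follows the usual convention: a ratio below 1 points to purifying selection, a ratio near 1 to neutral evolution, and a ratio above 1 to positive selection. A minimal sketch of that rule is shown below. The gene-pair names and substitution rates are invented for illustration and are not taken from Table S4.

```python
def selection_mode(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify a duplicated gene pair by its Ka/Ks ratio."""
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"

# Hypothetical gene pairs with made-up Ka and Ks values.
pairs = {
    ("geneA", "geneB"): (0.05, 0.40),
    ("geneC", "geneD"): (0.12, 0.30),
}
modes = {pair: selection_mode(ka, ks) for pair, (ka, ks) in pairs.items()}
for pair, mode in modes.items():
    print(pair, mode)
```

With every pair falling below 1, as reported for the MsPCO family, all pairs would be labeled as under purifying selection.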
2.8. Expression Analysis of MsPCO Genes Under Different Abiotic Stresses
Our expression analysis revealed that MsPCO family genes exhibit distinct, treatment-specific expression patterns across the tested abiotic stresses and phytohormone signaling pathways, reflecting functional divergence in their potential roles in mediating alfalfa’s adaptive responses to environmental cues (Figure 8A–H).
Under ethylene (ETH) treatment, MsPCO20 expression was rapidly induced at 3 h, reached its peak at 6 h, and then gradually declined while remaining significantly elevated compared to the control (Figure 8A). For PEG-induced drought stress, MsPCO19 exhibited a dramatic upregulation at 48 h, MsPCO20 showed a biphasic regulatory pattern with distinct peaks at 12 h and 48 h, and MsPCO26 displayed early induction as early as 3 h post-treatment (Figure 8B). NaCl-mediated salt stress elicited transient upregulation of MsPCO13 at 3 h, whereas MsPCO19 showed delayed induction with peak expression at 48 h (Figure 8C). Cold stress (4 °C) triggered rapid activation of MsPCO19 at 3 h, followed by a gradual decline, in contrast to MsPCO26, which exhibited a progressive increase in expression, peaking at 24 h (Figure 8D). Abscisic acid (ABA) treatment resulted in late induction of MsPCO21 at 24 h, early suppression of MsPCO12 followed by subsequent recovery, and transient upregulation of MsPCO26 at 6 h (Figure 8E). Gibberellic acid (GA) treatment notably upregulated MsPCO13 and MsPCO19 at 48 h, while suppressing the majority of MsPCO genes during the early (3-12 h) treatment period, suggesting a potential negative regulatory role of GA in MsPCO gene expression (Figure 8F). In response to salicylic acid (SA), MsPCO20 showed a sharp induction at 12 h, while MsPCO26 exhibited sustained upregulation from 3 h until a decline at 24 h (Figure 8G).
In summary, the MsPCO gene family of M. sativa exhibits complex and divergent temporal expression dynamics in response to seven distinct stress and phytohormone treatments—including ethylene (ETH), PEG-induced drought, NaCl-mediated salinity, 4 °C cold stress, and the phytohormones abscisic acid (ABA), gibberellic acid (GA), and salicylic acid (SA)—with specific family members displaying dramatic expression fluctuations ranging from several-fold to orders of magnitude at key time points across the treatment periods. These expression profiles strongly support the notion that MsPCO genes serve as crucial regulatory hubs in alfalfa, potentially integrating signals from both abiotic stress pathways and phytohormone signaling networks to orchestrate coordinated adaptive responses, with their differential expression patterns reflecting functional specialization in mediating distinct stress tolerance and developmental processes relevant to alfalfa ecological adaptation and agricultural productivity.
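Fold changes of the kind summarized above are commonly derived from qRT-PCR cycle threshold (Ct) values. This section does not state how the expression levels were quantified, so the sketch below assumes the widely used 2^-ΔΔCt method with a single reference gene and equal amplification efficiencies. The function name and all Ct values are invented for illustration.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene (treated vs. control) by the 2^-ddCt method."""
    d_ct_treated = ct_target_treated - ct_ref_treated   # normalize to reference gene
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control                  # treated relative to control
    return 2.0 ** (-dd_ct)

# Made-up Ct values: the target amplifies 3 cycles earlier after treatment,
# while the reference gene is unchanged.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold)  # 8.0
```

Because each cycle corresponds to a doubling, a 3-cycle shift gives an 8-fold induction, and larger shifts quickly reach changes of orders of magnitude.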
2.9. Tissue-Specific Expression Profiling of MsPCO Genes in M. sativa
Tissue-specific expression profiling of MsPCO genes in M. sativa uncovered distinct spatial regulation patterns across six organs (roots, stems, leaves, flower buds, flowers and pods), providing insight into their potential functional specialization during alfalfa development (Figure 9). Several expression clusters were evident: MsPCO12/13/21 were predominantly expressed in leaves; MsPCO4/6/12/13/19/21 accumulated preferentially in flower buds; MsPCO6/11/12/13/19/26 in mature flowers; and MsPCO13/21/26 in pods, reflecting enrichment in reproductive organs. By contrast, MsPCO5/7/16/20/34 maintained consistently low expression across most examined tissues. Most MsPCO genes showed elevated expression in reproductive organs (flower buds, flowers and pods), with individual members exhibiting specialized patterns in specific tissues. Together, these patterns suggest that functional diversification has occurred within the MsPCO gene family to support tissue-specific developmental processes and potentially to coordinate reproductive growth with stress adaptation in alfalfa.
2.10. Subcellular Localization Analysis
Candidate genes were selected as targets for further exploration of abiotic stress response pathways. The qRT-PCR expression profiles indicated that MsPCO19 and MsPCO20 were strongly and consistently upregulated under several stresses, including drought, salinity, and low temperature, implying that they may act as key regulators of broad stress adaptation. These two genes were therefore chosen for subcellular localization analysis. We constructed recombinant localization vectors by inserting the coding sequences of the target genes into the pCAMBIA2300-GFP expression vector, fusing each MsPCO protein to green fluorescent protein (GFP), and transiently expressed them in plant cells via Agrobacterium-mediated infiltration, a widely used method for rapid subcellular localization analysis (Figure 10). Consistent with the functional versatility suggested by their differential expression across stresses and tissues, fluorescence microscopy revealed distinct targeting patterns: MsPCO20 showed dual localization, with GFP fluorescence detected in both the nucleus and the cell membrane, whereas MsPCO19 was localized specifically to the nucleus.
3. Discussion
The PCO family serves as the core group of O₂-sensing enzymes in plants: these enzymes not only catalyze O₂-dependent reaction steps but also regulate the turnover of ERF-VII transcription factors through the N-degron pathway, specifically targeting the methionine-cysteine (Met-Cys) dipeptide [12]. This conserved mechanism mediates the transduction of Reactive Oxygen Species (ROS)-related hormonal signals, thereby enabling PCO to participate in key biological processes such as root growth, pollen tube elongation and abiotic stress responses [13]. The functions of PCO in abiotic stress adaptation and growth regulation have been validated in B. napus. M. sativa, a globally crucial leguminous forage crop, has meanwhile emerged as a research focus for improving stress resistance, yield and quality through molecular breeding, a goal that hinges on the systematic characterization of key functional gene families. To date, several gene families in alfalfa have been comprehensively studied, including those associated with growth regulation and stress response (e.g., NAC, ERF and DOF) [14,15,16], secondary metabolism and forage quality (e.g., MYB) [17], and signal transduction (e.g., HD-ZIP and CML) [18,19,20], providing critical foundational support for molecular breeding efforts. However, systematic research on the PCO family, which is pivotal for plant O₂ perception and stress response, has not been reported in alfalfa, and studies on this family remain limited overall: only five PCO members have been identified in A. thaliana, while 20, 8 and 7 members were characterized in three Brassica species [8]. To fill this knowledge gap, the current study used bioinformatics techniques to identify and characterize the PCO family in alfalfa based on its reference genome. In total, 35 MsPCO members were identified, a markedly higher number than reported in other plant species. This is likely due to alfalfa’s high genomic heterozygosity and autotetraploid nature [21,22].
Protein domains represent fundamental functional units that underpin the execution of biological roles [23], and previous studies have established that the catalytic activity and functional specificity of PCO proteins depend on the conserved PCO_ADO domain [8]. The diversity in physicochemical properties and the predominant cytoplasmic localization of the identified MsPCO proteins are consistent with their proposed roles in intracellular O₂ sensing and signal transduction, providing a structural and spatial basis for potential functional diversification within stress-responsive networks. Protein–protein interactions (PPIs) form the fundamental basis for sustaining plant life activities. Understanding protein functions and their underlying mechanisms enables a more accurate comprehension of biological processes and further reveals the regulatory patterns and mechanisms governing organismal life processes [24]. PCOs represent one of the core regulators in plant responses to abiotic stress. We predicted the PPI network of PCO by aligning sequences in the STRING database. The results indicated that PCO occupies a central position in this network and exhibits direct interactions with multiple proteins known to be involved in abiotic stress responses. Notably, PRT6 emerged as a key component within the PCO interaction network. As an E3 ubiquitin ligase, PRT6 targets ERF-VII transcription factors (e.g., RAP2.2 and RAP2.12, both of which also directly interact with PCO) for degradation via the 26S proteasome pathway. PCO promotes the recognition of ERF-VII factors by PRT6 through oxidizing specific cysteine residues in ERF-VII proteins, thereby participating in the signaling regulation of abiotic stresses such as hypoxia and salt stress [25]. Furthermore, PCO interacts with CYS2, a cysteine synthase whose encoding gene is significantly induced under drought and high-salt stresses. Deficiency in CYS2 enhances plant sensitivity to oxidative stress, highlighting its protective role in abiotic stress responses [26].
Gene family evolution is mostly driven by variation in gene structure: eukaryotic gene evolution is frequently accompanied by intron gain or loss (with intron loss occurring more frequently) [27], and differences in exon–intron organization can directly influence protein function, gene expression patterns and regulatory mechanisms [28]. Additionally, LTR retrotransposons, which are widespread in legume genomes, can insert into introns, promoters, or other genomic regions to alter gene structure and function [29]. Notably, our gene structure analysis revealed that MsPCO family members share conserved motif compositions and similar exon–intron arrangements, a characteristic that typically indicates retention of core family-specific functions among homologous genes, thereby providing valuable structural insights to guide subsequent functional characterization of MsPCO proteins.
Our research showed that the 35 identified MsPCO genes have an uneven distribution throughout the chromosomes of M. sativa. This is a common feature of gene families in most plant species [30] and is likely connected to important evolutionary chromosomal events like deletions, insertions, gene duplications, and inversions [31]. Gene duplication is essential for the expansion of gene families and functional innovation in plants, as well as for their adaptation to changing environmental conditions [32], and it primarily drives the expansion of plant gene families through two major pathways: segmental duplication and tandem duplication [33]. Segmental duplication events are more prevalent in slowly evolving genomes [34]. Our analysis revealed that the expansion of the MsPCO family was predominantly driven by segmental duplication, and all duplicated gene pairs exhibited Ka/Ks ratios of less than one, indicating that they have undergone strong purifying selection. By eliminating deleterious mutations and preserving functionally critical structural features [28], this purifying selection process helps maintain the stability of core biological functions within the MsPCO family, collectively providing robust evolutionary evidence supporting the functional conservation of this gene family in alfalfa.
Since abiotic stress can severely impede plant growth and development and even cause plant death, it is a key environmental constraint affecting crop yield. In response, plants have evolved diverse, sophisticated regulatory mechanisms to adapt to such adverse conditions [35,36]. A key component of these adaptive mechanisms is the precise regulation of gene expression, which relies on the interaction between cis-acting elements in gene promoter regions and transcription factors (TFs)—specifically, TFs modulate gene expression by binding to cis-regulatory sequences located upstream of the 5′ end of target genes, either activating or repressing their transcription [37]. Therefore, stress-associated TFs frequently preferentially control genes with stress-responsive cis-elements in their promoters, allowing plants to quickly adjust gene expression in response to environmental stimuli [38,39,40,41,42], for instance, the TF HSFB2b, which has been shown to regulate the expression of GmC4H and GmCHS3 under salt stress by specifically binding to heat shock elements (HSEs) in their promoter regions [5]. Based on this regulatory paradigm, the enrichment of stress- and hormone-related cis-elements in MsPCO promoters, together with their strong transcriptional induction under specific abiotic stresses, aligns with the previously reported stress-responsive functions of the PCO family in other plant species [8]. This collectively suggests that the MsPCO family may contribute to alfalfa abiotic stress adaptation through the canonical cis-element-TF regulatory pathway, wherein stress-induced TFs likely bind to conserved cis-elements in MsPCO promoters to modulate their expression and downstream stress response cascades.
Tissue-specific gene expression patterns offer critical information for understanding the physiological functions of genes in planta [43]. Our study revealed that all MsPCO genes exhibit distinct tissue-specific expression profiles, a characteristic consistent with the expression patterns of the PCO family reported in Mentha canadensis [44] and Petunia ancestors [45]. This cross-species similarity suggests that tissue-specific expression may represent a conserved evolutionary feature of the PCO family across diverse plant lineages. Further analysis indicated that the majority of MsPCO genes are highly expressed in flower buds and mature flowers, implying their potential involvement in regulating alfalfa reproductive growth. Differential gene expression across tissues is typically orchestrated by a complex interplay of multiple regulatory mechanisms, including chromosomal structure dynamics, histone modifications, transcription factor (TF) binding, and the coordinated action of cis-regulatory elements [25]. Such a complex regulatory system may explain why MsPCO expression levels differ by several-fold to more than a hundredfold across tissues and stress conditions. Together, these results support the notion that MsPCO family members function as significant regulatory nodes in alfalfa. Through their tightly regulated, differential expression patterns, they appear to integrate signals from developmental programs and environmental stress cues, participating in both the regulation of reproductive growth and adaptive responses to diverse abiotic challenges.
This integrative regulatory capacity is exemplified by the putative functions of the most stress-responsive family members, MsPCO19, MsPCO20 and MsPCO26. Although direct co-expression evidence linking these MsPCO genes with established stress marker genes in alfalfa is currently lacking in public transcriptomic datasets, their pronounced induction under multiple abiotic stresses suggests that they act as upstream regulatory components within conserved stress-signaling networks. Given the evolutionarily conserved role of PCO enzymes in the oxygen-sensing mechanism that post-translationally regulates Group VII Ethylene Response Factor (ERF-VII) stability across diverse plant species [46], it is plausible to hypothesize that these highly inducible MsPCO genes modulate the activity of orthologous ERF-VII transcription factors in alfalfa. Such modulation is predicted to orchestrate the expression of a suite of downstream effector genes. This regulatory cascade likely operates through two primary mechanisms. First, it may initiate the expression of classical stress-responsive molecular markers. This is supported by the well-characterized paradigm of the DRE/DREB regulon in gene expression in response to drought and cold stress in plants such as A. thaliana and O. sativa [47]. Second, it may drive essential physiological acclimations, most notably osmotic adjustment, a process heavily reliant on the accumulation of compatible solutes like proline. The pivotal role in this adaptive pathway of the P5CS1 gene, which encodes a key rate-limiting enzyme in proline biosynthesis, has been empirically validated in a crop system, where its overexpression enhanced salinity tolerance [48]. Therefore, the observed expression patterns of MsPCO19/20/26 are consistent with their proposed role as initiators of a transcriptional cascade that culminates in the coordinated expression of both early signaling components and terminal functional effectors.
Future research employing transcriptome profiling following the genetic manipulation (e.g., overexpression or knockout) of these pivotal MsPCO genes will be essential to directly identify their downstream target genes and definitively elucidate their position within the alfalfa stress regulatory hierarchy.
Understanding the subcellular localization of proteins is critical for elucidating their biological functions, as correct targeting enables interaction with specific partners within defined organelles, thereby determining participation in specific metabolic or signaling cascades [49]. Consistent with this principle, the differential localization patterns observed among MsPCO family members suggest a functional division of labor, a strategy commonly associated with the adaptability of plant gene families to complex environments [50]. This specialization likely allows the family to mediate processes across distinct subcellular compartments, supporting both nuclear regulatory events and membrane-localized signaling to enhance stress tolerance [51].
Specifically, the exclusive nuclear localization of MsPCO19 aligns with the canonical role of PCO in oxidizing ERF-VII transcription factors, positioning it as a direct regulator of stress-responsive gene expression within the nucleus [52]. In contrast, the dual localization of MsPCO20 to the cell membrane and nucleus implies a more integrative, signaling role. Its membrane association may facilitate the perception of apoplastic or membrane-derived stress signals (e.g., ROS bursts), while its nuclear presence enables direct transcriptional modulation. This pattern resembles stimulus-dependent nucleo-cytoplasmic shuttling in signaling proteins, suggesting MsPCO20 could act as a bridge coupling early membrane signaling to downstream transcriptional reprogramming [53].
The functional implications of these localization patterns are further refined by the gene expression profiles, which are mechanistically rooted in their promoter architectures. The abundance of stress-responsive cis-elements (e.g., ABRE, DRE/CRT) provides a basis for their induction under corresponding abiotic stresses [54], while floral-specific elements may drive expression in reproductive tissues. For MsPCO19, its nuclear targeting and stress-induced expression converge to suggest a dedicated role in nuclear stress signaling. For MsPCO20, its dual localization, coupled with its specific expression pattern, hints at a role in integrating extracellular stress perception with nuclear responses. Together, the synthesis of promoter architecture, expression profiling, and subcellular localization elucidates how individual MsPCO members achieve functional diversification, collectively supporting alfalfa’s adaptation to environmental challenges.
In summary, our systematic expression analysis identified two MsPCO genes, MsPCO19 and MsPCO20, as candidates highly responsive to drought, salinity, and cold stress. Notably, the subcellular localization results show that the MsPCO19 protein is mainly localized on the cell membrane, while MsPCO20 exhibits dual localization on the cell membrane and nucleus. This difference suggests that the two proteins may play distinct roles in stress perception and signal transduction. Their conserved protein domains, significant and sustained induction under various abiotic stresses, and potential regulatory functions in oxygen-sensing pathways collectively support their role as upstream signaling elements for stress adaptation in alfalfa. These two genes are therefore recommended as high-priority targets for in-depth functional validation through genetic transformation and multi-omics methods, with the ultimate goal of developing new alfalfa germplasm with broad-spectrum stress resistance through molecular breeding.
4. Materials and Methods
4.1. Genome-Wide Identification and Characterization of the PCO Gene Family in Alfalfa
The genomic sequences, protein sequences, and annotation files of M. sativa L. cv. ‘Xinjiangdaye’ were obtained from the public database (https://figshare.com/projects/whole_genome_sequencing_and_assembly_of_Medicago_sativa/66380, accessed on 12 November 2024). The hidden Markov model (HMM) profile of the PCO domain was downloaded from the Pfam database [55]. To identify potential PCO family members in the alfalfa genome, bioinformatic screening was performed using TBtools software (version 2.364) [56], specifically employing its HMM search and BLAST functions.
4.2. Chromosomal Localization of Identified Genes
Chromosomal distribution patterns were determined through comprehensive analysis of genomic annotation files. The chromosomal localization of identified genes was visualized using the gene location visualize function within TBtools software, which processes GTF/GFF format annotation files [56].
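As a rough illustration of how a chromosomal distribution can be tallied from an annotation file, the sketch below counts gene features per chromosome in GFF3-formatted text. It is not the TBtools implementation used in this study; the helper name `count_genes_per_chromosome`, the gene IDs, and the chromosome names are hypothetical placeholders.

```python
from collections import Counter


def count_genes_per_chromosome(gff_lines, gene_ids):
    """Count how many of the requested gene IDs lie on each chromosome.

    gff_lines: iterable of GFF3 text lines (nine tab-separated columns).
    gene_ids:  set of gene IDs of interest (hypothetical identifiers here).
    """
    counts = Counter()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/directive lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue  # only gene-level features are counted
        # Column 9 holds key=value attributes separated by ';'
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        if attrs.get("ID") in gene_ids:
            counts[cols[0]] += 1  # column 1 is the sequence (chromosome) name
    return counts


# Made-up example records, not real alfalfa annotations.
example_gff = [
    "##gff-version 3",
    "chr1.1\tdemo\tgene\t100\t900\t.\t+\t.\tID=geneA;Name=PCO_a",
    "chr1.1\tdemo\tgene\t5000\t6200\t.\t-\t.\tID=geneB",
    "chr2.3\tdemo\tgene\t300\t1500\t.\t+\t.\tID=geneC",
    "chr2.3\tdemo\tmRNA\t300\t1500\t.\t+\t.\tID=geneC.t1;Parent=geneC",
    "chr4.2\tdemo\tgene\t10\t800\t.\t+\t.\tID=geneD",
]
per_chrom = count_genes_per_chromosome(example_gff, {"geneA", "geneB", "geneC"})
for chrom, n in sorted(per_chrom.items()):
    print(chrom, n)
```

An uneven distribution of the kind reported for the MsPCO family would appear here as chromosomes with very different counts, including chromosomes with none.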
4.3. Phylogenetic Relationship Reconstruction and Evolutionary Analysis
The PCO gene and protein sequences of A. thaliana, M. truncatula, G. max and O. sativa were retrieved from the TAIR database (http://www.arabidopsis.org, accessed on 13 November 2024) and the Ensembl Plants database (https://plants.ensembl.org, accessed on 13 November 2024). A phylogenetic tree was constructed using MEGA7 software (version 7.0.26) with the Neighbor-Joining method, aligning the identified MsPCO sequences with those of the four model species [57]. This comparison was used to assess the homology and evolutionary divergence of MsPCO genes relative to their orthologs in the model plants.
4.4. Characterization of Protein Physicochemical Properties
The physicochemical properties of the identified MsPCO proteins, including molecular weight, instability index, aliphatic index, and grand average of hydropathicity (GRAVY), were calculated using the Protein Parameter Calc module in TBtools [56]. Gene locations and coding sequence lengths were determined through genomic sequence analysis. The theoretical isoelectric points (pI) were predicted using the ProtParam tool on the ExPASy server (https://web.expasy.org/protparam/, accessed on 15 November 2024). Subcellular localization was predicted using the WoLF PSORT online tool (https://wolfpsort.hgc.jp/, accessed on 15 November 2024).
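Two of the indices listed above have simple closed-form definitions, which the following minimal sketch illustrates. It assumes the standard ProtParam-style definitions: GRAVY as the mean Kyte-Doolittle hydropathy value per residue, and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is a mole percentage. Whether TBtools applies exactly these definitions is an assumption, and the test peptide is made up.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Made-up peptide for illustration only, not an MsPCO sequence.
demo = "AVIL"
print(round(gravy(demo), 3), round(aliphatic_index(demo), 1))
```

A negative GRAVY value indicates an overall hydrophilic protein, and a higher aliphatic index is usually read as a sign of greater thermostability.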
4.5. Comprehensive Analysis of Amino Acid Motifs, Protein Conserved Domains, and Gene Structures
Protein motif identification was conducted using the MEME Suite online platform (http://meme-suite.org/, accessed on 16 November 2024), with subsequent visualization performed through the gene structure view (Advanced) module in TBtools software. For conserved domain analysis, protein sequences were submitted to the NCBI Batch CD-Search tool, and the resulting data were visualized using the same TBtools module [58]. Gene structure analysis was performed through comprehensive examination of gene annotation files, followed by visualization using TBtools gene structure view (Advanced). Multiple sequence alignment was executed using the MUSCLE Wrapper implemented in TBtools [59].
4.6. Protein Structure Prediction and Network Analysis
Protein sequences were submitted to NPS@: SOPMA secondary structure prediction (IBCP; https://npsa.lyon.inserm.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html, accessed on 18 November 2024) for secondary structure analysis. The protein sequences were subsequently submitted to SwissModel (Swiss Bioinformatics Resource Portal; https://www.swissmodel.expasy.org/, accessed on 18 November 2024) for three-dimensional structure prediction [60]. To analyze the protein–protein interaction (PPI) network, resources from public PPI databases were utilized, following the methodology described by [61].
4.7. Genome-Wide Identification and Functional Annotation of Cis-Regulatory Elements
The genomic annotation files (GFF3 format) and corresponding genome sequences of M. sativa were processed using the GFF3 Sequences Extract module in TBtools software. The promoter regions (2000 bp upstream sequences) of each MsPCO gene were extracted using the TBtools Fasta Extract function. These sequences were submitted to the PlantCARE database to identify cis-regulatory elements. The distribution of elements was visualized using the TBtools Simple BioSequence Viewer [56]. A heatmap was generated from the annotation data to compare the abundance of different cis-acting element types across all MsPCO promoters.
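The promoter extraction step can be sketched as follows. This is a hedged illustration of strand-aware extraction of an upstream window from 1-based GFF3 coordinates, not the TBtools code; the function names and the toy chromosome are hypothetical, and the 2000 bp default simply mirrors the window stated above.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_promoter(chrom_seq, start, end, strand, upstream=2000):
    """Return the upstream region of a gene, read 5'->3' on the gene's strand.

    start, end: 1-based inclusive gene coordinates (GFF3 convention).
    strand:     '+' or '-'.
    The window is truncated at the chromosome ends if the gene lies too close.
    """
    if strand == "+":
        p_end = start - 1                       # last base before the gene
        p_start = max(1, p_end - upstream + 1)
        return chrom_seq[p_start - 1:p_end]
    if strand == "-":
        p_start = end + 1                       # first base after the gene
        p_end = min(len(chrom_seq), end + upstream)
        return reverse_complement(chrom_seq[p_start - 1:p_end])
    raise ValueError("strand must be '+' or '-'")


# Toy 12 bp "chromosome" for illustration only.
toy_chrom = "AACCGGTTAAGT"
print(extract_promoter(toy_chrom, 5, 8, "+", upstream=4))  # bases 1-4
print(extract_promoter(toy_chrom, 5, 8, "-", upstream=4))  # bases 9-12, reverse-complemented
```

Reverse-complementing minus-strand promoters matters because motif databases report cis-element positions relative to the sequence as submitted.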
4.8. Collinearity Analysis
Genomic data of alfalfa were analyzed with the TBtools Fasta Stats module to produce a chromosomal skeleton map. The chromosomal locations of MsPCO genes were obtained from the gene annotation data using the Table Row Extractor/Filter function in TBtools. For microsynteny analysis, gene segments encompassing MsPCO genes and their adjacent regions were prepared using the File Transformatter for Micro Synteny Viewer tool. Collinear regions were detected and visualized with the Advanced Circos tool to identify conserved genomic blocks. To assess evolutionary selection pressures on duplicated MsPCO gene pairs, the non-synonymous substitution rate (Ka), synonymous substitution rate (Ks), and Ka/Ks ratios were computed using the Simple Ka/Ks Calculator (NG) module in TBtools, based on nucleotide sequences, protein sequences, and gene pair files derived from intra-species collinearity analysis [62].
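The interpretation of Ka/Ks ratios can be made explicit with a short sketch. It only classifies precomputed Ka and Ks values using the conventional thresholds (below 1 purifying, about 1 neutral, above 1 positive selection); it does not estimate Ka or Ks from alignments, which the TBtools module does. The function name, the tolerance, and the gene-pair values are hypothetical.

```python
def classify_selection(ka, ks, tol=1e-9):
    """Return (Ka/Ks ratio, label) for one duplicated gene pair.

    Returns (None, 'undefined') when Ks is zero or negative, because the
    ratio cannot be interpreted in that case.
    """
    if ks <= 0:
        return None, "undefined"
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        label = "neutral"
    elif ratio < 1.0:
        label = "purifying"
    else:
        label = "positive"
    return ratio, label


# Made-up Ka and Ks values for illustration; not results from this study.
example_pairs = {
    ("geneA", "geneB"): (0.10, 0.50),
    ("geneC", "geneD"): (0.30, 0.20),
    ("geneE", "geneF"): (0.05, 0.00),
}
for pair, (ka, ks) in example_pairs.items():
    print(pair, classify_selection(ka, ks))
```

Under this scheme, the finding that every duplicated MsPCO pair has a ratio below one corresponds to every pair receiving the "purifying" label.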
4.9. Expression Analysis of Alfalfa PCO Family Member Genes
The plant material used in this study was M. sativa L. cv. ‘Xinjiangdaye’. Seeds were surface-sterilized and germinated on moist filter paper in a light incubator (25 °C, 16 h light/8 h dark). After 5 days, uniform seedlings were transferred to an artificial climate chamber under identical conditions for hydroponic cultivation, irrigated every three days with half-strength Hoagland nutrient solution. Thirty-day-old seedlings were subjected to the following treatments, each with three biological replicates.
Abiotic stress treatments included salinity (200 mM NaCl), drought (15% PEG-6000), and low temperature (4 °C, 16 h light/8 h dark). Phytohormone treatments were applied using 100 µM solutions of abscisic acid (ABA), gibberellic acid (GA), salicylic acid (SA) and ethylene (ETH). For tissue-specific expression analysis, roots, stems, leaves, buds, flowers and pods were collected from mature plants grown in potting mixture (vermiculite:soil = 3:1) under controlled conditions (25 °C, 16 h light/8 h dark). For time-course experiments, leaf samples (approximately 0.1 g fresh weight per replicate) were collected at 0, 3, 6, 12, 24 and 48 h after treatment initiation. All samples were immediately flash-frozen in liquid nitrogen and stored at −80 °C.
Total RNA was extracted from frozen samples using TRIzol reagent (Thermo Fisher Scientific, Waltham, MA, USA). RNA concentration and purity were assessed spectrophotometrically (A260/A280 ratio). First-strand cDNA was synthesized from 1 µg of total RNA using NovoScript Plus All-in-one 1st Strand cDNA Synthesis SuperMix. Fifteen MsPCO genes were selected for expression profiling. Gene-specific primers were designed with Primer 5.0 (sequences in Supplementary Table S5) and synthesized by Xinjiang Youkang Biotechnology Co., Ltd. (Urumqi, China). The MsActin gene (AA660796) was used as an internal reference [63].
qRT-PCR was performed on a Thermo Fisher QuantStudio system. Each 20 µL reaction contained 10 µL of 2× NovoScript fast SYBR qPCR SuperMix, 0.5 µL of each primer (10 µM), 1 µL of cDNA, and 8 µL of ddH₂O. Thermal cycling conditions were 95 °C for 1 min; 40 cycles of 95 °C for 20 s and 60 °C for 1 min. Relative expression levels were calculated using the 2^(−ΔΔCt) method [64].
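For readers unfamiliar with the relative quantification step, the sketch below works through the standard ΔΔCt arithmetic: ΔCt is the target Ct minus the reference Ct, ΔΔCt is the treated ΔCt minus the control ΔCt, and the fold change is 2 raised to −ΔΔCt. The Ct values are invented, the function name is hypothetical, and averaging replicate Ct values before subtraction is one common convention assumed here rather than a detail taken from this study.

```python
from statistics import mean


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (treated vs. control) by the ddCt method.

    Each argument is a list of replicate Ct values. The calculation assumes
    roughly 100% amplification efficiency for both target and reference.
    """
    delta_ct_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    delta_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Invented Ct values: target gene vs. a reference gene (e.g., an actin gene).
fold = relative_expression(
    ct_target_treated=[22.0, 22.0, 22.0],
    ct_ref_treated=[18.0, 18.0, 18.0],
    ct_target_control=[25.0, 25.0, 25.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
print(fold)
```

In this example the treated ΔCt is 4 and the control ΔCt is 7, so ΔΔCt is −3 and the target is eightfold more abundant after treatment.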
4.10. Subcellular Localization
Gene-specific primers were designed and synthesized by Urumqi Pengran Biotechnology Co., Ltd. (Urumqi, China) (Table S6). The coding sequences of selected PCO genes were amplified and cloned into the SacI and XbaI sites of the pCAMBIA2300-GFP vector. The resulting expression vectors were introduced into Agrobacterium tumefaciens strain EHA105 (pSoup) using the freeze–thaw method. Bacterial cultures were resuspended in infiltration buffer (10 mM MES, pH 5.6; 10 mM MgCl₂; 100 µM acetosyringone) to an OD600 of 1.0. Agrobacterium carrying the empty pCAMBIA2300-GFP vector served as a control. Nicotiana benthamiana plants were grown in a potting mixture (vermiculite:soil = 3:1) in an artificial climate chamber at 23 °C under a 16 h light/8 h dark photoperiod. Five-week-old plants were infiltrated with the bacterial suspension using a syringe. Infiltrated leaves were kept in darkness for 24 h (23 °C), then transferred to normal growth conditions (23 °C, 16 h light/8 h dark) for 48 h before imaging. Fluorescence images were captured and processed using a Zeiss laser scanning confocal microscope and ZEN 3.11 software (blue edition) (Carl Zeiss AG, Oberkochen, Germany).
4.11. Data Processing
All quantitative gene expression data are presented as the mean ± standard deviation (SD) of three biological replicates. Statistical analysis was performed using SPSS 25 software. Differences among treatment groups were assessed by one-way analysis of variance (ANOVA), followed by Tukey’s honestly significant difference (HSD) post hoc test for multiple comparisons. A probability value of p < 0.05 was considered statistically significant. Data visualization and graph generation were conducted using GraphPad Prism 9.5 software.
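The summary statistics and the core of the one-way ANOVA can be illustrated with a small standard-library sketch. It computes the mean ± SD per group and the F statistic with its degrees of freedom; it does not compute the p-value or Tukey's HSD comparisons, which require the F and studentized range distributions and were obtained with SPSS in this study. The expression values and function names are made up.

```python
from statistics import mean, stdev


def summarize(groups):
    """Return {group name: (mean, sample SD)} for replicate measurements."""
    return {name: (mean(vals), stdev(vals)) for name, vals in groups.items()}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    samples = list(groups.values())
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    # Variation of group means around the grand mean
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of replicates around their own group mean
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented relative expression values, three biological replicates per group.
expression = {
    "0 h": [1.0, 2.0, 3.0],
    "6 h": [2.0, 3.0, 4.0],
    "24 h": [5.0, 6.0, 7.0],
}
print(summarize(expression))
print(one_way_anova_f(expression))
```

A large F relative to the critical value for (df_between, df_within) indicates that at least one group mean differs, which is the condition under which post hoc pairwise tests such as Tukey's HSD are then applied.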
5. Conclusions
This study presents the first genomic inventory of the PCO family in alfalfa, addressing a knowledge gap concerning PCO genes in the stress biology of forage legumes. We demonstrate that the MsPCO family has expanded evolutionarily and displays broad transcriptional responsiveness to key abiotic stresses, extending its functional paradigm beyond conserved hypoxia signaling. MsPCO19 and MsPCO20 are prioritized as core stress-responsive candidates, with their distinct subcellular localizations hinting at potential functional diversification. These findings establish a foundational framework and provide concrete genetic targets for future functional studies and molecular breeding aimed at enhancing abiotic stress resilience in alfalfa.
References
- 1. Giuntoli B., Perata P. Group VII Ethylene Response Factors in Arabidopsis: Regulation and physiological roles. Plant Physiol. 2018, 176, 1143–1155. doi:10.1104/pp.17.01225. PMID: 29269576. PMCID: PMC5813551.
- 2. Zhu J. Abiotic stress signaling and responses in plants. Cell 2016, 167, 313–324. doi:10.1016/j.cell.2016.08.029. PMID: 27716505. PMCID: PMC5104190.
- 3. Mittler R. Abiotic stress, the field environment and stress combination. Trends Plant Sci. 2006, 11, 15–19. doi:10.1016/j.tplants.2005.11.002. PMID: 16359910.
- 4. Lu L., Wei W., Tao J., Lu X., Bian X., Hu Y., Cheng T., Yin C., Zhang W., Chen S. Nuclear factor Y subunit GmNFYA competes with GmHDA13 for interaction with GmFVE to positively regulate salt tolerance in soybean. Plant Biotechnol. J. 2021, 19, 2362–2379. doi:10.1111/pbi.13668. PMID: 34265872. PMCID: PMC8541785.
- 5. Bian X., Li W., Niu C., Wei W., Hu Y., Han J., Lu X., Tao J., Jin M., Qin H. A class B heat shock factor selected for during soybean domestication contributes to salt tolerance by promoting flavonoid biosynthesis. New Phytol. 2020, 225, 268–283. doi:10.1111/nph.16104. PMID: 31400247.
- 6. Taylor-Kearney L., Madden S., Wilson J., Myers W., Gunawardana D., Pires E., Holdship P., Tumber A., Rickaby R., Flashman E. Plant cysteine oxidase oxygen-sensing function is conserved in early land plants and algae. ACS Bio Med Chem Au 2022, 2, 521–528. doi:10.1021/acsbiomedchemau.2c00032. PMID: 36281301. PMCID: PMC9585510.
- 7. Chen Z., Guo Q., Wu G., Wen J., Liao S., Xu C. Molecular basis for cysteine oxidation by plant cysteine oxidases from Arabidopsis thaliana. J. Struct. Biol. 2021, 213, 107663. doi:10.1016/j.jsb.2020.107663. PMID: 33207269.
- 8. Bian X., Cao Y., Zhi X., Ma N. Genome-wide identification and analysis of the Plant Cysteine Oxidase (PCO) gene family in Brassica napus and its role in abiotic stress response. Int. J. Mol. Sci. 2023, 24, 11242. doi:10.3390/ijms241411242. PMID: 37511002. PMCID: PMC10379087.
