Genome-wide identification and salt stress response analysis of the WD40 protein family in Populus yunnanensis
Yi Wu, Yude Kang, Lincui Shi, Aizhong Liu, Ping Li

TL;DR
This study identifies and analyzes WD40 proteins in Populus yunnanensis to understand their role in salt stress response.
Contribution
The study provides a comprehensive analysis of WD40 proteins in P. yunnanensis and their role in salt stress response.
Findings
258 PyWD40 proteins were identified in P. yunnanensis with diverse physicochemical properties.
Most PyWD40s were upregulated under salt stress, validated by qRT-PCR for 14 genes.
PyWD40s interact with proteins involved in signal transduction and co-express under stress.
Abstract
The WD40 proteins constitute a large regulatory family involved in a wide range of biological processes and stress responses. Advances in sequencing technologies have facilitated the identification of numerous WD40 proteins with diverse functions in many plant species. However, research on tree species remains limited. Populus yunnanensis is an economically and ecologically important tree species that faces salt stress in certain habitats. Therefore, studying WD40 proteins in P. yunnanensis is of particular significance for understanding its salt tolerance mechanisms. A variable number of WD40 proteins were identified across six poplar species, generally correlating with their genome sizes. In P. yunnanensis, 258 PyWD40s were identified, exhibiting considerable variation in amino acid number, and other physicochemical properties, suggesting potential functional diversity. The…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7- —the Open Fund of the First-Class Discipline Construction of Forestry, Southwest Forestry University, and the Open Project of the Key Laboratory of Forest Resources Conservation and Utilization in the
- —Yunnan Fundamental Research Projects
- —the National Natural Science Foundation of China
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsPlant Molecular Biology Research · Plant Stress Responses and Tolerance · Plant Gene Expression Analysis
Introduction
The WD40 domain represents an ancient and extensive family of regulatory proteins in eukaryotes. This family is characterized by conserved core repeat units, each typically terminating with a Trp-Asp (WD) dipeptide sequence [1]. As one of the most abundant protein–protein interaction domains in eukaryotic genomes, the WD40 domain plays a vital role in mediating interactions between protein and DNA [2, 3]. Structurally, the domain assumes a β-propeller architecture, generally composed of seven blades, with each blade comprising four antiparallel β-strands [4, 5]. This domain folds into a β-propeller structure within proteins, offering a platform for the interaction and assembly of multiple protein partners, often serving as a scaffold with signalosome [3]. The β-propeller structure provides three distinct interaction surfaces (top, bottom, and circumference), enabling WD40 proteins to engage in reversible, multi-partner binding. A well-characterized example is the Gβ subunit of heterotrimeric G proteins, which forms a tightly associated dimer (Gβγ) with the Gγ subunit. Due to the central role of G proteins in transmembrane signaling, WD40 domains are frequently, though often misleadingly, labeled as “transducing-like” [6]. The functional repertoire of WD40 proteins encompasses a broad spectrum of critical eukaryotic processes. Certain WD-repeat proteins, functionally analogous to Gβγ, are integral to signal transduction cascades, serving as subunits of phosphatase [7], anchors for protein kinase C, or adaptors for receptor complexes [8]. The conserved three-dimensional structure common to WD40 family members underlies similar binding interfaces and functional capacities, implying evolutionary and mechanistic links to G-protein signaling pathways [9].
WD40 proteins constitute a functionally diverse group with roles spanning various organisms. In yeast, for instance, several WD-repeat proteins perform specific functions. The Saccharomyces cerevisiae protein Pfs2p forms multi-protein complexes essential for mRNA cleavage and polyadenylation [10]. Additionally, the WD-repeat proteins PWP2 and PEB1 are important for cell separation and the import of thiolase into peroxisomes [11, 12]. Futherthmore, WD40 repeat motifs are genetically related to those of the TPR (tetratricopeptide repeat) family, highlighting an evolutionary connection between these widespread repeat-protein families [12]. In plants, WD40 proteins exhibit similarly diverse functions. In Arabidopsis thaliana, the WD40 domain cyclophilin, CYP71 acts as a histone remodeling factor involved in chromatin-based gene silencing, contributing to gene repression and organogenesis regulation [13]. Arabidopsis XIW1, an XPO1-interacting WD40 protein, interacts with XPO1 and ABI5, functioning as a nuclear transport receptor that positively regulates the ABA (abscisic acid) response [14]. In rice, a WD40 protein OsKRN2 regulates rice grain number by controlling secondary panicle branching and exhibits a conserved interaction with DUF1644 in both maize and rice [15]. The rice WD40 gene OsTTG1 participates in biosynthetic pathways through interactions with numerous proteins, particularly transcription factors [16]. Similarly, in mango, the WD40 protein MiTTG1 interacts with MiMYB0, MiTT8, and MibHLH1 to form a ternary MYB-bHLH-WD40 complex, enhancing tolerance to mannitol, salt, and drought stress when expressed in transgenic Arabidopsis [17]. In peppers, interactions between WD40 proteins and CaAN1 or CaDYT1 implicate them in anthocyanin biosynthesis and male sterility [18]. In maize, the WD40 protein SHREK1 contributes to ribosome biogenesis and kernel development via interactions with other ribosomal proteins [19]. Moreover, WD40-repeat proteins can act as substrate recognition subunits for DDB1-CUL4 ubiquitin E3 ligase complexes, thereby influencing diverse biological processes [20]. Collectively, these examples underscore that protein–protein interactions are a fundamental mechanism through which WD40 proteins exert their wide-ranging biological functions across different organisms.
Extensive research on WD40 proteins across various plant species has primarily focused on their identification, quantification, and phylogenetic classification. In cereal crops, for example, 743 WD40 proteins were identified in wheat (Triticum aestivum L.) and classified into 5 clusters comprising 11 subfamilies [21]. Similarly, 164 HvWD40 proteins in barley (Hordeum vulgare L. var. nudum Hook. f, also known as Qingke) were grouped into 11 clusters and 14 subfamilies [22], 225 WD40 genes in foxtail millet (Setaria italica L.) were categorized into 5 subfamilies (I—V) [23], and approximately 200 potential OsWD40 genes were identified in rice [24]. Among solanaceous plants, 168 WD40 proteins in potato (Solanum tuberosum L.) were divided into 5 clusters (Cluster I-V) and 10 classes [25], while 207 WD40 genes in tomato (Solanum lycopersicum L.) were classified into 5 clusters and 12 subfamilies [26]. The number of WD40 proteins in different pepper (Capsicum annuum) species ranged from 237 to 257 [18]. In horticultural plants, 315 WD40 members were identified in mango (Mangifera indica L.) and further divided into 11 subgroups [17], and 191 WD—repeat (WDR) proteins in cucumber were classified into 21 subgroups [27]. For tree species, a more limited study identified 42 JrWD40s in walnut (Juglans regia L.), classifying them into 9 clusters [28].
WD40 repeat proteins are crucial components of eukaryotic genomes, participating in diverse developmental processes and environmental interactions in plants. A seminal study in maize delineated a nearly complete pan-genome WD40 repertoire, revealing substantial functional and genomic diversity and suggesting that gene duplications and Helitron transposon-mediated translocations have played key roles in the amplification of this gene family [29]. Collectively, these studies underscore the widespread distribution and functional versatility of WD40 proteins across the plant kingdom. However, the majority of this research has been conducted on herbaceous model and crop species, leaving a significant gap in our understanding of WD40 proteins in woody plants, particularly forest trees. Given the profound ecological and economic importance of forest trees, a deeper exploration of their WD40 protein family is essential. Populus yunnanensis, a tree species native to the low latitudes and high-altitude regions of southwest China, represents one such economically and ecologically important species [30–32]. In this context, our study aims to conduct the identification, bioinformatics analysis, and exploration of stress resistance of WD40 proteins in P. yunnanensis, which will contribute to a more comprehensive understanding of the functions and regulatory mechanisms of WD40 proteins in woody plants.
Results
WD40 proteins showed significant differences among poplar species
WD40 proteins were identified in Populus species via a homology-based search for the conserved WD40 domain. Subsequent validation using the NCBI CD-search and SMART databases confirmed the presence of WD40 proteins in six Populus species: 258 in P. yunnanensis, 250 in Populus trichocarpa, 427 in Populus tomentosa, 380 in Populus euphratica, 357 in Populus alba, and 270 in Populus deltoides (Table S1). This analysis revealed considerable variation in WD40 protein copy number across the examined Populus species.
We conducted a detailed analysis of the physicochemical properties of the 258 identified WD40 proteins in P. yunnanensis, designated PyWD40-1 to PyWD40-258 based on their gene IDs (which correspond to their chromosomal order; Table 1). These proteins exhibited substantial variation in amino acid length (103–3600 residues) and molecular weight (11,640.07 to 400,117.3 Da). The theoretical isoelectric points (pI) ranged from 4.32 to 9.68, classifying 80 as basic (pI > 7) and 178 as acidic (pI < 7). Using an instability index threshold of 40, we found that the number of unstable proteins (156; instability index > 40) was greater than that of stable ones (102; instability index < 40), with stability indices themselves panning a wide range (21.1 (PyWD40-164) to 64.67 (PyWD40-226)). The aliphatic index also varied considerably (49.96–103) further indicating diversity in protein stability. Most PyWD40 proteins were hydrophilic, with grand average of hydropathicity (GRAVY) values between −0.906 and −0.02. Only three proteins (PyWD40-24, PyWD40-85, PyWD40-235) were predicted to be hydrophobic, with positive GRAVY values of 0.128, 0.032, and 0.028, respectively. Subcellular localization predictions suggested diverse cellular roles, with 141 proteins localized to the nucleus, 55 to the chloroplast, 47 to the cytoplasm, 6 to the plasma membrane, 3 each to mitochondria and the cytoskeleton, 2 to the vacuole, and 1 to the peroxisome.Table 1. Physicochemical characteristics and subcellular location of P. yunnanensis WD40 proteinsProtein IDGene IDNumber of amino acidsDomainMolecular weight (Da)Isoelectric point (pI)Instability indexAliphatic indexGRAVYSubcellular localizationPyWD40-1Poyun0023041673–39745,403.818.5442.2786.9−0.108CytoplasmPyWD40-2Poyun00272765189–47085,126.378.7158.9549.46−0.872NucleusPyWD40-3Poyun0045734935–30438,776.828.641.9764.81−0.444NucleusPyWD40-4Poyun00467773488–77384,010.786.4242.1168.93−0.519NucleusPyWD40-5Poyun00710505381–462,289–46256,211.678.5152.8276−0.417NucleusPyWD40-6Poyun0074645815–24851,806.25.3546.3681.97−0.152CytoplasmPyWD40-7Poyun00814746265–58784,091.666.2250.2972.45−0.476NucleusPyWD40-8Poyun009001069287–1069118,842.495.845.3373.08−0.54NucleusPyWD40-9Poyun009431522560–715166,176.345.8342.287.81−0.041NucleusPyWD40-10Poyun01093530207–50658,358.979.2444.3678.94−0.269ChloroplastPyWD40-11Poyun01755892297–653,93–452,32–12399,738.356.3340.2186.45−0.206ChloroplastPyWD40-12Poyun0176835013–30038,003.55.5943.5273−0.364NucleusPyWD40-13Poyun0181991617–297103,483.674.9334.7984.72−0.343ChloroplastPyWD40-14Poyun020591129416–724,578–1020,939–1109124,493.637.8239.4283.99−0.23ChloroplastPyWD40-15Poyun0218494866–226106,280.455.0945.5289.51−0.271NucleusPyWD40-16Poyun0226535013–34538,687.194.8229.9573.8−0.295NucleusPyWD40-17Poyun02714113812–335122,779.255.0949.2479.78−0.277ChloroplastPyWD40-18Poyun0274021112–19023,155.49.0631.7868.39−0.317ChloroplastPyWD40-19Poyun0308734219–30537,888.488.5142.968.42−0.48CytoplasmPyWD40-20Poyun0318554728–36259,738.914.9532.590.71−0.24NucleusPyWD40-21Poyun0319330014–29933,110.186.4536.3580.5−0.286CytoplasmPyWD40-22Poyun0358750365–38655,923.218.4148.8872.13−0.454NucleusPyWD40-23Poyun035913424–30837,762.65.8649.7674.97−0.245CytoplasmPyWD40-24Poyun039544122–32443,587.655.6743.7995.150.128CytoplasmPyWD40-25Poyun0409790315–903101,202.925.7558.5477.56−0.391NucleusPyWD40-26Poyun04131969482–803106,421.565.4446.8771.08−0.616NucleusPyWD40-27Poyun04214429128–42947,705.067.5151.0181.96−0.307NucleusPyWD40-28Poyun04247371100–36541,681.465.7251.7881.91−0.142CytoplasmPyWD40-29Poyun0465932572–24135,606.687.0342.3775.6−0.206NucleusPyWD40-30Poyun0476431313–29535,193.676.1526.3577.57−0.397CytoplasmPyWD40-31Poyun04773445242–42550,036.415.0329.9665.71−0.533NucleusPyWD40-32Poyun04963675361–66975,183.696.2653.7976.87−0.45NucleusPyWD40-33Poyun050161098346–673,848–1059120,991.476.6337.1184.03−0.255ChloroplastPyWD40-34Poyun0509330817–29434,910.675.828.6580.65−0.227CytoplasmPyWD40-35Poyun05117902614–90298,926.186.3350.7968.58−0.649NucleusPyWD40-36Poyun05204479111–43254,685.388.8450.0774.68−0.375NucleusPyWD40-37Poyun0613115451121–1373173,898.676.4751.3286.6−0.235NucleusPyWD40-38Poyun06144501145–40355,285.765.7346.3472.61−0.549NucleusPyWD40-39Poyun0619932112–31934,271.745.9229.2180.56−0.128CytoplasmPyWD40-40Poyun06355548291–51058,689.487.6641.5472.03−0.341NucleusPyWD40-41Poyun06433959403–699,102–520106,432.086.8842.8587.92−0.311ChloroplastPyWD40-42Poyun06658128052–427,409–732139,1226.0243.4378.76−0.255NucleusPyWD40-43Poyun06783940397–722104,481.198.5853.4871.46−0.598NucleusPyWD40-44Poyun069731131399–692,763–1051124,694.116.5740.9580.05−0.326NucleusPyWD40-45Poyun07194574264–56263,513.735.3745.780.61−0.446NucleusPyWD40-46Poyun07580448254–42850,876.484.9534.4767.86−0.534NucleusPyWD40-47Poyun07710725384–72382,549.136.1742.4473.93−0.715NucleusPyWD40-48Poyun08098104311–361113,937.556.0236.3878.8−0.317NucleusPyWD40-49Poyun0818335549–35539,497.485.2439.1272.76−0.32NucleusPyWD40-50Poyun08219417168–39546,467.594.5831.8581.37−0.44CytoplasmPyWD40-51Poyun08357515201–50457,888.999.0543.9173.2−0.538NucleusPyWD40-52Poyun0842137757–36941,133.16.5231.8978.33−0.222NucleusPyWD40-53Poyun0845776112–29484,142.115.6634.9298.21−0.167CytoplasmPyWD40-54Poyun08477457106–42650,649.269.1136.8378.62−0.384ChloroplastPyWD40-55Poyun0947031861–31035,698.315.4637.5181.23−0.142CytoplasmPyWD40-56Poyun09610657370–65673,559.97.2745.673.65−0.498NucleusPyWD40-57Poyun1004454372–53962,547.926.2749.2783.11−0.388VacuolePyWD40-58Poyun10283780503–78085,723.156.6750.967.44−0.558NucleusPyWD40-59Poyun10360449250–42950,707.264.9543.6973.41−0.449CytoplasmPyWD40-60Poyun10513424124–40448,400.034.7244.4378.87−0.547NucleusPyWD40-61Poyun1052834696–29639,054.714.7651.7183.41−0.427CytoskeletonPyWD40-62Poyun1056034713–34238,688.174.9130.5673.31−0.343NucleusPyWD40-63Poyun10673442149–33848,349.336.2647.2775.23−0.311NucleusPyWD40-64Poyun1069138429–19342,005.855.3941.4776.15−0.32NucleusPyWD40-65Poyun108571402129–380156,523.675.9842.6891.19−0.118NucleusPyWD40-66Poyun11002744271–57882,425.995.8146.3575−0.343NucleusPyWD40-67Poyun1100825422220–2482,2444–2539279,553.246.2243.2888.24−0.077NucleusPyWD40-68Poyun1125629832254–2511325,592.515.4850.288.03−0.145Plas membranePyWD40-69Poyun1127743883–43147,628.479.0126.986.42−0.265ChloroplastPyWD40-70Poyun11451485120–44052,949.858.8439.1573.15−0.412NucleusPyWD40-71Poyun1151827650–15631,083.485.0943.790.4−0.097CytoplasmPyWD40-72Poyun1160081915–330,533–70490,811.96.4843.6683.08−0.267NucleusPyWD40-73Poyun116713268664–1084366,516.415.9543.5496.56−0.109Plas membranePyWD40-74Poyun1175010344–9211,640.075.448.8595.53−0.296ChloroplastPyWD40-75Poyun11838472275–45253,671.265.1828.1577.88−0.38ChloroplastPyWD40-76Poyun12238580261–56064,522.326.0447.3593.59−0.263NucleusPyWD40-77Poyun1228131416–31134,890.68.6129.2381.91−0.224MitochondrionPyWD40-78Poyun1232935953311–3591399,637.055.747.2893.45−0.133Plas membranePyWD40-79Poyun123631392–11215,045.118.7241.2477.99−0.284CytoplasmPyWD40-80Poyun12596444176–44448,790.037.5142.2293.06−0.022CytoplasmPyWD40-81Poyun1269934836–34838,748.274.9541.4880.17−0.287NucleusPyWD40-82Poyun1300915031235–1503167,765.035.8248.8388.89−0.241ChloroplastPyWD40-83Poyun1325836003317–3596400,117.35.7146.7893.69−0.137Plas membranePyWD40-84Poyun1329243229–24346,577.015.3226.6793.10.032ChloroplastPyWD40-85Poyun13342581262–56064,849.685.9845.0394.11−0.273NucleusPyWD40-86Poyun13718304118–29533,058.555.5132.5787.3−0.03ChloroplastPyWD40-87Poyun1380875841–331,634–70883,770.916.2756.4674.26−0.488NucleusPyWD40-88Poyun1390731712–31334,825.277.230.181.1−0.217PeroxisomePyWD40-89Poyun13953573220–53164,846.385.1327.2670.24−0.706NucleusPyWD40-90Poyun14057541179–49960,123.139.2949.8981.76−0.264VacuolePyWD40-91Poyun1423133221–31336,943.816.1931.9181.3−0.218CytoplasmPyWD40-92Poyun14619477161–40052,683.525.1447.7872.6−0.504NucleusPyWD40-93Poyun14762503157–47556,050.35.9251.5270.36−0.773ChloroplastPyWD40-94Poyun14796655208–52274,020.376.2649.7868.66−0.527NucleusPyWD40-95Poyun1489676134–36683,390.46.834.9983.89−0.285ChloroplastPyWD40-96Poyun14939989350–494107,334.55.5561.3675.55−0.307ChloroplastPyWD40-97Poyun15045382202–279,33–27942,744.998.6337.4886.44−0.115NucleusPyWD40-98Poyun151011127340–659,835–1086124,597.597.1438.4283.43−0.269ChloroplastPyWD40-99Poyun1532661512–22967,335.925.7936.8891.32−0.242NucleusPyWD40-100Poyun15420480162–45753,023.119.1140.1371.08−0.464NucleusPyWD40-101Poyun15512474115–45553,337.847.644.6677.3−0.462NucleusPyWD40-102Poyun1577043399–41847,527.029.0637.8294.97−0.211ChloroplastPyWD40-103Poyun15907630275–57471,113.356.8738.3267.51−0.559NucleusPyWD40-104Poyun15922524226–51356,979.226.1627.1282.46−0.281NucleusPyWD40-105Poyun162743262–31435,629.146.3733.3278.04−0.311ChloroplastPyWD40-106Poyun162763018–28532,641.475.8624.3480.37−0.255ChloroplastPyWD40-107Poyun1646483357–444,377–83091,928.416.0440.385.91−0.14NucleusPyWD40-108Poyun166871713222–631192,306.185.9748.5166.05−0.736NucleusPyWD40-109Poyun1679716741257–1672184,584.26.4849.7291.62−0.124NucleusPyWD40-110Poyun17261485189–46152,710.159.2543.8873.11−0.42NucleusPyWD40-111Poyun17550385116–30742,495.66.1836.7885.82−0.142ChloroplastPyWD40-112Poyun17551537122–34460,054.49.5326.7776.67−0.528CytoplasmPyWD40-113Poyun17794505207–50156,954.955.3756.27103.27−0.114CytoplasmPyWD40-114Poyun178057489–24581,705.826.8933.5784.43−0.26ChloroplastPyWD40-115Poyun17872727386–72582,829.46.3145.0471.72−0.734NucleusPyWD40-116Poyun18136459102–364,327–45650,493.655.649.3280.46−0.375ChloroplastPyWD40-117Poyun18265463117–44451,128.168.9445.6684.04−0.193ChloroplastPyWD40-118Poyun1831691117–297102,948.154.9236.5585.28−0.323ChloroplastPyWD40-119Poyun1835635013–29937,908.295.443.0372.69−0.347NucleusPyWD40-120Poyun18365889294–650,32–41099,177.555.8839.3185.11−0.187ChloroplastPyWD40-121Poyun18967492143–46853,805.399.2546.9874.35−0.404ChloroplastPyWD40-122Poyun191051513550–715,16–120165,387.555.9542.9186.73−0.072NucleusPyWD40-123Poyun191491110328–1110122,761.655.441.9574.48−0.482NucleusPyWD40-124Poyun19218746266–58883,749.246.3644.2675.32−0.448NucleusPyWD40-125Poyun1927645815–24851,732.085.3243.6982.6−0.143CytoplasmPyWD40-126Poyun19533785500–78585,404.156.4443.8867.75−0.544NucleusPyWD40-127Poyun19544501187–45655,162.459.2243.3966.79−0.423NucleusPyWD40-128Poyun19713757184–46583,616.698.0557.9353.3−0.783NucleusPyWD40-129Poyun19717750177–45883,359.458.7155.1351.08−0.839NucleusPyWD40-130Poyun1975741975–39945,795.368.8139.8882.1−0.111CytoplasmPyWD40-131Poyun19991445242–42550,292.64.9231.1267.24−0.58NucleusPyWD40-132Poyun200131359192–507147,883.875.4254.0385.94−0.276ChloroplastPyWD40-133Poyun20069512202–51257,738.996.8141.982.46−0.336NucleusPyWD40-134Poyun20399942455–776103,625.185.8247.3968.34−0.656NucleusPyWD40-135Poyun2043590515–905101,593.75.4557.279.88−0.339NucleusPyWD40-136Poyun2079233731–16036,525.619.4635.8188.19−0.166ChloroplastPyWD40-137Poyun209203752–37540,592.278.0835.1583.92−0.145CytoplasmPyWD40-138Poyun2094331849–30935,746.315.4636.7379.37−0.171ChloroplastPyWD40-139Poyun2111799029–293108,835.268.2448.5380.7−0.324Plas membranePyWD40-140Poyun2113843568–38348,858.774.7446.7265.72−0.511NucleusPyWD40-141Poyun21465569103–56564,203.456.0153.9778.12−0.426ChloroplastPyWD40-142Poyun215748089–24588,255.827.2936.3982.38−0.31ChloroplastPyWD40-143Poyun21578529221–51559,811.195.545.6196.35−0.198CytoplasmPyWD40-144Poyun21817557122–35762,226.339.6827.3883.38−0.442CytoplasmPyWD40-145Poyun2181838754–31142,655.695.3638.6885.09−0.163NucleusPyWD40-146Poyun22099485191–46152,799.329.3742.9373.71−0.422NucleusPyWD40-147Poyun226081729185–644193,966.386.2849.1265.76−0.72NucleusPyWD40-148Poyun2296630119–28532,929.625.3623.5677.08−0.359ChloroplastPyWD40-149Poyun229673268–31435,738.256.5535.1777.45−0.352ChloroplastPyWD40-150Poyun23268525226–51457,182.646.2225.6684.34−0.266NucleusPyWD40-151Poyun23280581282–58165,676.746.3339.5265.3−0.693NucleusPyWD40-152Poyun23525389112–32643,897.327.253.9574.68−0.492NucleusPyWD40-153Poyun23794465157–44251,245.289.2637.2877.14−0.378NucleusPyWD40-154Poyun2379613331039–1118150,669.626.948.490.05−0.275NucleusPyWD40-155Poyun240871124337–656,440–947,832–1083124,082.987.0437.7283.83−0.27ChloroplastPyWD40-156Poyun24149382202–279,55–27942,479.497.6334.3487.96−0.083CytoplasmPyWD40-157Poyun24241989372–408,353–494107,146.355.657.3175.65−0.307ChloroplastPyWD40-158Poyun2428679634–36887,341.977.5636.583.61−0.266MitochondrionPyWD40-159Poyun24500455154–42950,469.518.2450.5774.22−0.547ChloroplastPyWD40-160Poyun2478686826–61195,289.495.940.9789.75−0.118NucleusPyWD40-161Poyun248051136350–659,812–1103125,050.986.6343.2780.72−0.299NucleusPyWD40-162Poyun2518838953–38641,603.744.3835.4281.7−0.239CytoplasmPyWD40-163Poyun252811532–15317,271.87.7721.170.72−0.487CytoplasmPyWD40-164Poyun25717651179–49871,677.185.3545.8268.33−0.653NucleusPyWD40-165Poyun25719519212–50958,475.465.5544.488.98−0.258NucleusPyWD40-166Poyun25852424110–40348,406.994.742.5277−0.562NucleusPyWD40-167Poyun26057672110–40375,830.77.0847.8875.57−0.477NucleusPyWD40-168Poyun26169443249–42350,078.885.1837.2671.51−0.46NucleusPyWD40-169Poyun2622637817–23242,369.747.5637.375.77−0.27NucleusPyWD40-170Poyun2623534696–29639,038.754.7551.4284.83−0.403CytoskeletonPyWD40-171Poyun26281488131–46954,966.796.9944.4873.48−0.411NucleusPyWD40-172Poyun2628753453–33859,462.519.4145.8788.46−0.268NucleusPyWD40-173Poyun26326441146–33548,102.055.849.6981.79−0.247NucleusPyWD40-174Poyun2634337829–18841,470.275.2248.5572.96−0.327NucleusPyWD40-175Poyun264321611233–614,577–661180,839.936.8346.6670.97−0.652NucleusPyWD40-176Poyun265781930483–540,421–532213,308.65.6146.6388.89−0.248NucleusPyWD40-177Poyun266831051251–1051116,922.56.8251.0774.02−0.483NucleusPyWD40-178Poyun26856547248–52959,924.917.831.2180.02−0.246ChloroplastPyWD40-179Poyun2691429952249–2526327,291.35.5748.8889.32−0.166Plas membranePyWD40-180Poyun2694743579–42747,344.099.0227.9486.55−0.284ChloroplastPyWD40-181Poyun27183484119–43952,860.68.6942.6372.5−0.413NucleusPyWD40-182Poyun272794509–32049,600.46.4348.6190.33−0.188NucleusPyWD40-183Poyun2738745238–43349,680.968.9936.1977.68−0.38NucleusPyWD40-184Poyun2740381815–330,532–70390,257.456.9543.9583.67−0.261NucleusPyWD40-185Poyun2763154722–36359,996.945.1534.9184.11−0.34NucleusPyWD40-186Poyun27807455154–42950,492.618.5247.2173.36−0.552ChloroplastPyWD40-187Poyun280717704–505,505–54385,458.755.9436.0389.87−0.111NucleusPyWD40-188Poyun28315833533–82992,260.865.836.2588.85−0.179CytoplasmPyWD40-189Poyun2854739054–38741,803.654.3236.2978−0.324CytoplasmPyWD40-190Poyun2862934547–34238,381.517.1839.1368.35−0.466NucleusPyWD40-191Poyun2865931547–31234,989.666.5935.570.54−0.443NucleusPyWD40-192Poyun2897445259–36051,680.739.4845.0568.83−0.594NucleusPyWD40-193Poyun29223658208–52174,084.185.9644.9969.38−0.509NucleusPyWD40-194Poyun292464513–35749,430.217.2342.2285.9−0.166NucleusPyWD40-195Poyun29250503157–47556,166.76.2956.1771.33−0.733NucleusPyWD40-196Poyun29381481184–40452,755.525.0152.272.83−0.474NucleusPyWD40-197Poyun29891575222–53364,907.515.3325.3668.99−0.722NucleusPyWD40-198Poyun30065467100–42651,639.49.3949.9874.09−0.434MitochondrionPyWD40-199Poyun30228489208–47453,324.494.5242.4781.96−0.384NucleusPyWD40-200Poyun3033131732–31334,789.176.5833.8874.95−0.254CytoplasmPyWD40-201Poyun3040636538–30439,642.484.8949.2378.3−0.132CytoskeletonPyWD40-202Poyun3047639458–38544,065.125.9746.0467.69−0.498CytoplasmPyWD40-203Poyun30533452159–44348,892.77.2128.772.39−0.297NucleusPyWD40-204Poyun30780428159–40348,839.39.174776.99−0.592NucleusPyWD40-205Poyun30822577258–56864,320.918.8860.4580.69−0.569NucleusPyWD40-206Poyun3094973723–34081,213.927.2847.192.86−0.099ChloroplastPyWD40-207Poyun31013509217–50856,133.315.836.8687.41−0.316NucleusPyWD40-208Poyun3119040661–40044,962.818.0436.3790.99−0.288NucleusPyWD40-209Poyun3120012189–317136,606.296.5833.2291.73−0.201ChloroplastPyWD40-210Poyun312133018–28532,719.525.6722.4280.33−0.275CytoplasmPyWD40-211Poyun314251208849–1145134,170.985.9345.0789.92−0.169ChloroplastPyWD40-212Poyun315093288–32335,903.636.9925.7489.18−0.124ChloroplastPyWD40-213Poyun3151360966–60766,137.936.2630.1888.64−0.153ChloroplastPyWD40-214Poyun31648489208–47453,219.354.5441.7281.57−0.38ChloroplastPyWD40-215Poyun31814723270–57980,860.565.5848.8767.37−0.507NucleusPyWD40-216Poyun3203845264–36251,657.719.3944.271.39−0.579NucleusPyWD40-217Poyun3260934019–30737,510.255.8148.5277.12−0.234NucleusPyWD40-218Poyun32678494137–45054,064.446.6733.8783.1−0.163NucleusPyWD40-219Poyun32813439115–43248,181.495.537.5789.48−0.19CytoplasmPyWD40-220Poyun33524722473–69078,759.827.0729.3381.11−0.05NucleusPyWD40-221Poyun335451883–15621,013.815.2527.676.17−0.219NucleusPyWD40-222Poyun33617675361–66975,234.676.3251.8778.18−0.457NucleusPyWD40-223Poyun336511092306–642,783–1016121,126.135.2741.6485.93−0.232ChloroplastPyWD40-224Poyun336521128421–688,836–1089124,607.236.7239.5385.25−0.207ChloroplastPyWD40-225Poyun33766987699–987110,019.416.5664.6760.2−0.906NucleusPyWD40-226Poyun34148503150–40855,378.925.9448.3371.73−0.573NucleusPyWD40-227Poyun346481125348–659,814–1045123,920.316.7539.2179.8−0.321ChloroplastPyWD40-228Poyun35174459163–43451,058.328.7350.2276.97−0.506NucleusPyWD40-229Poyun35523557307–51959,525.336.9842.6970.68−0.337NucleusPyWD40-230Poyun35706449154–42450,188.488.2646.2274.1−0.48NucleusPyWD40-231Poyun358703167–28835,359.736.0337.1576.52−0.315CytoplasmPyWD40-232Poyun3594020720–17122,643.115.142.2675.94−0.229NucleusPyWD40-233Poyun3594818420–16520,201.324.750.1275.82−0.246NucleusPyWD40-234Poyun36448875283–45296,898.736.6547.9694.460.028*ChloroplastPyWD40-235Poyun36485728273–58481,324.875.3547.9372.01−0.466NucleusPyWD40-236Poyun3677860966–60765,990.846.1927.0990.08−0.128CytoplasmPyWD40-237Poyun367833288–32336,045.887.627.2487.68−0.138CytoplasmPyWD40-238Poyun368841207848–1144133,938.465.9743.5289.66−0.179NucleusPyWD40-239Poyun3694396827–300106,491.966.0938.3381.49−0.344NucleusPyWD40-240Poyun3704012209–317136,797.566.6233.0991.11−0.202ChloroplastPyWD40-241Poyun3705440358–39744,455.917.0235.190.94−0.336NucleusPyWD40-242Poyun37368582263–57364,813.469.2460.6481.36−0.535NucleusPyWD40-243Poyun37635450157–44148,899.927.2228.9273.78−0.295NucleusPyWD40-244Poyun3768839064–38543,448.35.7445.7670.15−0.492CytoplasmPyWD40-245Poyun378403879–38241,943.215.9636.180.39−0.305CytoplasmPyWD40-246Poyun3784395064–224107,037.415.142.6189.71−0.285CytoplasmPyWD40-247Poyun38349113712–335122,832.025.1353.2578.21−0.299CytoplasmPyWD40-248Poyun3837347197–46952,121.48.5530.8680.49−0.332CytoplasmPyWD40-249Poyun3868834219–30538,127.678.0441.168.42−0.495CytoplasmPyWD40-250Poyun38885458107–42850,658.769.2744.3272.93−0.503NucleusPyWD40-251Poyun3891176112–29484,073.215.4835.3998.09−0.15CytoplasmPyWD40-252Poyun3894337757–36940,899.96.6929.4681.17−0.173NucleusPyWD40-253Poyun3900333012–30236,918.579.0229.4688.58−0.02CytoplasmPyWD40-254Poyun39136418168–39546,887.994.632.8383.23−0.468CytoplasmPyWD40-255Poyun3917649628–40654,969.745.5346.4176.85−0.39ChloroplastPyWD40-256Poyun39226104211–361113,935.586.2836.4279−0.33NucleusPyWD40-257Poyun39333543155–47260,565.248.8438.8984.88−0.262CytoplasmThe proteins PyWD40-8, PyWD40-25, PyWD40-58, PyWD40-124, PyWD40-136, PyWD40-142, PyWD40-168, PyWD40-178, PyWD40-203, and PyWD40-245 are reserved for the PLN00181 domain; the proteins PyWD40-69 and PyWD40-180 are reserved for the Beach domain; the protein PyWD40-74 is reserved for the Neurobeachin domain; the protein PyWD40-133 is reserved for the Ge1_WD40 domain. * Hydrophobic protein (GRAVY > 0). Proteins with a negative GRAVY value are hydrophilic and unmarked
Phylogenetic analysis and sub-family classification of WD40 in P. yunnanensis
To elucidate the evolutionary relationships of WD40 proteins in P. yunnanensis, a phylogenetic tree was constructed using sequences from 258 P. yunnanensis and 230 A. thaliana WD40 proteins (Fig. 1). The proteins from both species clustered into eight major evolutionary clades. Following the established subfamily classification for A. thaliana, the P. yunnanensis WD40 proteins were correspondingly classified into eight subfamilies (I- VIII), containing14, 19, 13, 39, 49, 57, 34, and 33 members, respectively (Table S2).Fig. 1. Phylogenetic analysis of the WD40 proteins from P. yunnanensis and A. thaliana. Protein sequences of WD40 from P. yunnanensis and A. thaliana were aligned using MAFFT and FastTree. The tree is rooted using Chlamydomonas reinhardtii (Cre06.g302750_4532, indicated by purple dots) as the outgroup. Protein origins are marked as follows: red dots, A. thaliana; blue dots, P. yunnanensis; purple dots, C. reinhardtii. Branch support values are based on 1000 bootstrap replicates
Uneven distribution and collinearity analysis of WD40 on chromosomes in P. yunnanensis
The 258 PyWD40 genes were unevenly distributed across all 19 chromosomes of P. yunnanensis (designated LG01–LG19, where ‘LG’ denotes Linkage Group; Fig. 2A). Chromosome LG01 harbored the highest number (31 genes), followed by LG04 (24 genes) and LG11 (21 genes). LG05 contained 17 gens, while LG02, LG06, and LG07 each contained 16. The remaining chromosomes (LG03, LG08, LG09, LG10, LG12, LG13, LG14, LG16, LG17, LG18, LG19) contained 11, 11, 13, 9, 7, 10, 14, 8, 6, 11, and 13 genes, respectively. Chromosome LG15 contained the fewest, with only four PyWD40 genes.Fig. 2. Chromosomal location and collinearity analysis of WD40 genes from P. yunnanensis. A Chromosomal location of WD40 genes in P. yunnanensis. The chromosome-level scaffolds are designated as “LG” (Linkage Group) according to the reference genome assembly. The green bar plot on the right displays the number of WD40 genes located on each chromosome (LG). B Collinearity analysis of WD40 genes from P. yunnanensis and A. thaliana. The blue lines indicate the collinearity relationship between P. yunnanensis and A. thaliana WD40s. The gray lines represent the collinearity relationships of all the genes in the genome of P. yunnanensis and A. thaliana. Violet boxes depict chromosomes of P. yunnanensis, yellow boxes depict chromosomes of A. thaliana. Chromosome numbers are indicated beside the boxes. C Intra-genomic collinearity analysis of WD40 genes in P. yunnanensis. The red lines indicate the collinearity relationship among P. yunnanensis WD40s. The gray lines represent the collinearity relationships of all the genes in the P. yunnanensis genome
Collinearity analysis revealed 174 homologous pairs between P. yunnanensis and A. thaliana WD40 genes (Fig. 2B, Table S3). Furthermore, intra-genomic analysis demonstrated strong sequence homology and conservation among P. yunnanensis WD40s identifying 114 segmental duplication pairs and one tandem duplication pair (PyWD40-221 and PyWD40-222) (Fig. 2C, Table S4). The Ka/Ks ratios of all WD40 gene pairs were all less than 1 (ranging from 0.0184 to 0.449), indicating that this gene family has undergone purifying selection and exhibits high sequence conservation (Table S5).
Structural analysis of the three-dimensional models of WD40 proteins in P. yunnanensis
To investigate sequence and structural features, representative WD40 proteins from the eight subfamilies were subjected to three-dimensional modeling. Members of each subfamily were successfully matched to a distinct GRAS protein template (Fig. 3): Subfamily I (PyWD40-28) to A0A6M2E7Y9.1.A; Subfamily II (PyWD40-65) to B9GS95.1.A; Subfamily III (PyWD40-216) to A0A6M2EW64.1.A; Subfamily IV (PyWD40-137) to A0A2K1Z0D2.1.A; Subfamily V (PyWD40-55) to A0A2K1Z638.1.A; Subfamily VI (PyWD40-99) to I1NJ39.1.A; Subfamily VII (PyWD40-85) to A0A7J7BYT5.1.A; and Subfamily VIII (PyWD40-64) to A0A4U5Q2H2.1.A..Fig. 3. Protein 3D protein structure models and sequence similarity analysis of P. yunnanensis WD40s. A Predicted 3D structure models of the representative proteins from eight P. yunnanensis WD40 subfamilies. The models for PyWD40-28 (subfamily I), PyWD40-65 (subfamily II), PyWD40-216 (subfamily III), PyWD40-137 (subfamily IV), PyWD40-55 (subfamily V), PyWD40-99 (subfamily VI), PyWD40-85 (subfamily VII), and PyWD40-64 (subfamily VIII) were constructed based on the templates A0A6M2E7Y9.1.A, B9GS95.1.A, A0A6M2EW64.1.A, A0A2K1Z0D2.1.A, A0A2K1Z638.1.A, I1NJ39.1.A, A0A7J7BYT5.1.A, and A0A4U5Q2H2.1. B Heatmap depicting the sequence similarity between the P. yunnanensis WD40 proteins and their respective modeling templates. The color scale and numerical values indicate the percentage of sequence identity: 99.73%, 99.22%, 86.45%, 70.10%, 85.09%, 84.94%, 73.77%, and 97.02% respectively. GMQE values for each model: 0.95, 0.86, 0.62, 0.60, 0.77, 0.77, 0.86, and 0.79, respectively
Integrated sequence and structural analysis revealed that all modeled proteins were rich in β-sheet structures (Figure S1). The specific secondary structure composition for each model was as follows: Subfamily I: 31 β-sheets, 1 α-helix, 3 η-helices; Subfamily II: 30 β-sheets, 3 α-helices, 1 η-helix; Subfamily III: 37 β-sheets, 3 α-helices, 4 η-helices; Subfamily IV: 18 β-sheets, 1 η-helix; Subfamily V: 32 β-sheets, 1 α-helix, 3 η-helices; Subfamily VI: 62 β-sheets, 13 α-helices, 7 η-helices; Subfamily VII: 28 β-sheets, 7 α-helices; Subfamily VIII: 32 β-sheets.
Structure analysis of the gene and protein sequences of P. yunnanensis WD40
To further characterize the WD40 family, we analyzed their gene structures, conserved protein motifs, and domains (Fig. 4, Figure S2, Table S6). Beyond the core WD40 domain, the proteins harbored diverse auxiliary domains, the proteins harbored diverse auxiliary domains, which were classified into 27 distinct classes (Table 2). The majority of proteins (143) contained only the core WD40 domain (class 1). Numerous other proteins contained domains associated with specific functions, including metabolism (BOP1NT), vesicular transport and protein sorting (Coatomer_WDAD, COPI_C, Beach, RING_Ubox, Vps8, Clathrin), transcriptional regulation (Bromodomain, PHA03151), enzyme activity or recruitment (PH-like, Med15, BCAS3, F-box, RING_Ubox), signal transduction (CLH, PKc_like), and miRNA splicing (Prp19).Fig. 4. Sequence and structural features of representative WD40 proteins from P. yunnanensis. The figure displays an integrated analysis of (A) conserved motifs, (B) protein domains and (C) gene structure for representative members of the eight WD40 subfamilies (I-VIII) in P. yunnanensis. In panel A, different colored boxes represent distinct conserved motifs identified by MEME suite analysis. In panel B, different colored boxes indicate various conserved protein domains, as listed on the right. In panel C, green boxes represent exons, black lines represent introns, and yellow boxes represent untranslated regions (UTRs). The comprehensive figure for all 258 WD40 members is provided in Supplementary Figure S2Table 2Statistical table of domain types for P. yunnanensis WD40ClassesDomain compositionGene numberClass 1OnlyWD40143Class 2WD40 + CTLH or CTLH + LisH or CTLH + LisH_TPL or CTLH + LisH + Atrophin-116Class 3PLN00181 or PLN00181 + RING_Ubox10Class 4WD40 + CAF1C_H4-bd9Class 5WD40 + LisH or LisH + Med157Class 6WD40 + RING_Ubox or RING_Ubox + CLH or RING_Ubox + PKc_like or RING_Ubox + Vps8 + Clathrin or RING_Ubox + Prp19 or RING_Ubox + Prp19 + COG49137Class 7WD40 + Utp12 or Utp12 + ANAPC4_WD40 or Utp13 or Utp13 + PLN00181 or UTP15_C7Class 8WD40 + Coatomer_WDAD or Coatomer_WDAD + COPI_C4Class 9NBCH_WD40 + Beach + DUF4704 + PH-like or NBCH_WD40 + Beach + DUF4704 + PH-like + Neurobeachin + DUF4800 + Laminin_G_33Class 10WD40 + Beach + PKc_like or Beach + PH-like + Laminin_G_3 + DUF47043Class 11WD40 + Bromodomain or Bromodomain + PHA031513Class 12WD40 + Katanin_con80 or Katanin_con80 + Herpes_BLLF13Class 13WD40 + Med153Class 14WD40 + BCAS32Class 15WD40 + BING4CT2Class 16WD40 + BOP1NT or BOP1NT + DMP12Class 17WD40 + DENN + uDENN + dDENN2Class 18WD40 + F-box_SF or F-box-like2Class 19WD40 + Hira2Class 20WD40 + NLE2Class 21WD40 + Nup1602Class 22WD40 + PUL + PFU2Class 23WD40 + Sof12Class 24WD40 + TAF5_NTD22Class 25WD40 + Ubl1_cv_Nsp3_N-like2Class 26WD40 + zf-CCCH_3 or zf_CCCH_42Class 27Other14
All identified 258 proteins contain characteristic WD repeats. Domain architecture analysis specified that members of subfamilies I, II, III, IV, V, VI, VII, and VIII were annotated with the WD40 superfamily domain (PF00400), while the remaining members (largely in subfamily VI) were annotated with the PLN00181 domain, a related WD40-like superfamily member in Pfam. Notably, the LisH domain (Lissencephaly type-1-like homology) was identified in five members of subfamily I, two of subfamily IV, ten of subfamily VI and one of subfamily V. The Med15 domain was detected in PyWD40-36 (subfamily I); the CTLH (C-terminal to LisH) domain was detected in five of subfamily II; ten of subfamily VI and one of subfamily V. The RING-Ubox domain was present in two members of subfamily II, one of subfamily V, five of subfamily VI, and one of subfamily VIII.
Motif analysis indicated that no single motifs was common to all WD40 members. However, most of the identified motifs (1, 2, 4, 5, 7, 8, 10) contained the characteristic Trp (W) and Asp (D) residues (Table S6). Motif 1, 2, 3, 4, 5, 7, 9, and 10 were the most prevalent across the subfamilies, with distinct distribution patterns (e.g., Motif 1 was prevalent in subfamily I, II, III, IV, VI; Motif 2 in I, III, IV, V, VI, and VIII; Motif 3 in II, III, V, VIII; Motif 4 in I, II, III, IV, VIII; Motif 5 in I, VI, VIII; Motif 7 in most members of subfamily I, II, III, IV, V, VII, VIII; Motif 9 in most members of subfamily I, II, III, IV, VIII; Motif 10 in most members of subfamily II, III, VII).
The exon–intron structures of P. yunnanensis WD40 genes were highly diverse, with exon numbers ranging from 1 to 40 (Table S6). This structural heterogeneity was evident within subfamilies. For instance, Subfamily V and VI contained the most complex genes, with maximum exon counts of 25 and 32, respectively. In contrast, genes in Subfamily III predominantly had simpler architectures, with 7 to 12 exons. This variation suggests evolutionary divergence in gene structure among the WD40 subfamilies.
Abundant cis-acting elements implied the functions and regulatory mechanisms of WD40 in P. yunnanensis
To investigate the potential roles of P. yunnanensis WD40 genes in stress responses and their transcriptional regulation, we analyzed cis-acting elements within the 2-kb promoter regions upstream of each gene (Fig. 5, Table S7). A diverse array of elements was identified and categorized into three functional classes: stress-responsive (e.g., anaerobic induction, drought, low-temperature, defense), hormone-responsive (ABA, SA, GA, MeJA, auxin), and growth/development-related (light response, circadian control, protein-binding sites). Although prevalent, the abundance and combination of these elements varied markedly across subfamilies, providing a basis for functional distinction. Subfamilies II and III displayed minimalistic profiles, predominantly containing only light-responsive elements, or a combination of light-responsive and anaerobic induction elements, respectively. In contrast, Subfamilies IV-VIII possessed complex profiles, with most members harboring elements responsive to ABA, MeJA, anaerobic induction, and growth and development. Gibberellin (GA)-responsive elements were a distinctive feature exclusive to most members of Subfamilies IV and V. Subfamily I presented an intermediate and heterogeneous profile; most members contained a core set of light-responsive, anaerobic induction, and ABA-responsive elements, alongside secondary stress/hormone-related elements, while a few exceptions (PyWD40-43, PyWD40-59, PyWD40-226) lacked this core set, indicating intra-subfamily functional divergence.Fig. 5. Classification and abundance of cis-acting elements in the promoters of P. yunnanensis WD40 genes. Heatmap depicting the number of cis-acting elements in the 2 kb promoter regions of representative WD40 genes from eight subfamilies (I-VIII). The elements are grouped into three major functional categories: Stress Response, Hormone Response, and Growth & Development. The color scale on the right represents the normalized count or abundance of each element type, with colors ranging from white (low abundance) to red (high abundance)
Specific, less common cis-elements were also identified in subsets of genes, including those for anoxic-specific inducibility, maximal elicitor-mediated activation, conserved DNA module arrays (CMA3), sequence conserved in alpha-amylase promoter sand wound responsiveness (Table S7).
Expression profile analysis and verification of expression status of WD40 genes in P. yunnanensis under salt stress
Transcriptional profiles of WD40 genes were analyzed under control (CK), short-term low-concentration (25 mM NaCl, T1), and long-term high-concentration (75 mM NaCl, T4) salt stress conditions using existing RNA-seq data (Table S8). Most WD40 family members were upregulated under salt stress (Fig. 6A). While a small subset was downregulated or unchanged, the predominant upregulation indicates broad involvement in the salt stress response. Expression patterns also differed among subfamilies. Most members of subfamilies I and II, and the majority in IV-VIII, showed significant activation. Notably, the expression of many genes was basally low (FPKM < 100). Conversely, some highly expressed genes (FPKM > 100) were downregulated under salt stress, including PyWD40-140 (IV), PyWD40-109 (V), PyWD40-25, PyWD40-58, PyWD40-136, PyWD40-178 (VI), and PyWD40-241 (VIII).Fig. 6. Relative expression levels of WD40 genes from P. yunnanensis under salt stress. A Transcriptome heatmap of P. yunnanensis WD40 genes under salt stress. CK, untreated control; T1: 25 mM NaCl treatment for 2 d; T4: 75 mM NaCl treatment for 2 d after three-round treatments. B Relative expression levels of P. yunnanensis WD40 genes determined by qRT-PCR
To validate these patterns, qRT-PCR was performed on 14 representative differentially expressed PyWD40 genes (Fig. 6B). All 14 genes (PyWD40-19, PyWD40-32, PyWD40-59, PyWD40-76, PyWD40-113, PyWD40-156, PyWD40-158, PyWD40-170, PyWD40-181, PyWD40-184, PyWD40-207, PyWD40-213, PyWD40-217, PyWD40-224) exhibited upregulated expression trends under salt stress. Most were induced by both T1 and T4 treatment, with only PyWD40-213 and PyWD40-19 showing specific induction under T1 or T4, respectively. These results confirmed the transcriptome data and underscore the functional diversity of WD40 genes in the salt stress adaptation.
Analysis of the protein interaction network of the WD40 family in P. yunnanensis
To elucidate the regulatory network of WD40 proteins, a protein–protein interaction network was predicted for representative PyWD40s using the STRING database (Fig. 7A, Table S9). Beyond interactions with other WD40-domain proteins, PyWD40s were predicted to partner with proteins involved in diverse processes: signal transduction (e.g., transducin beta-like protein, calcium channel protein); nucleic acid metabolism/repair (RNase H, spliceosome, rRNA processing factors); energy synthesis and metabolism (ribosomal protein, rRNA biogenesis protein RRP5, nucleolar proteins, actin-related proteins, ATPase, MAD2); transcriptional regulation (transcription initiation factor TFIID subunits); and proteins with special domains (e.g., zinc finger, sas10, CBF, BED-type, RNA-binding, BUB1 N-terminal, HORMA domains). Correlation analysis revealed that the expression trends of many interacting proteins under salt stress were coordinated with those of their PyWD40 partners (Fig. 7B, Table S10). For instance, PyWD40-19 and its interactors (RNase H, BUB1 N-terminal, HORMA domain proteins), PyWD40-32 and its partners (Peptidase_M1, TAFII55_N, TAFII28, TFIIDl), and PyWD40-156 with an interacting ATPase, all showed concurrent upregulation under stress. These co-expression patterns suggested that PyWD40s may function in concert with specific partners to facilitate the salt stress response.Fig. 7. Interaction network of P. yunnanensis WD40 proteins and the relative expression of the interacting proteins. A Interaction network of P. yunnanensis WD40 proteins. The core-colored balls represent the representative P. yunnanensis WD40 proteins. The colorful balls around represent the interacting proteins of P. yunnanensis WD40 proteins. B Heatmap of the relative expression levels of the interacting proteins of P. yunnanensis WD40. The relative expression values were obtained from transcription data
Discussion
Genomic diversity and evolutionary conservation of WD40 proteins in Populus
The WD domain, characterized by Trp-Asp (WD) dipeptide repeats, is the hallmark feature of this protein family [1]. Consistent with previous studies across various plant species [17, 21–28], our work confirms that WD40 protein copy number varies substantially and shows no strict correlation with genome size or complexity. This trend is evident within the genus Populus, where we identified between 250 and 427 WD40 proteins across six species (Table S1). In P. yunnanensis, the 258 identified PyWD40 proteins exhibit significant diversity in key physicochemical properties, including molecular weight, isoelectric point (pI), stability, and predicted subcellular localization (Table 1), suggesting broad functional diversification within this family [18, 21, 33].
Phylogenetic analysis classified the PyWD40 proteins into eight distinct subfamilies (Fig. 1, Table S2). The number of subfamilies differs from reports in species like tomato, walnut, and wheat [21, 26, 28], underscoring lineage-specific evolutionary paths. The evolutionary trajectory of these genes is characterized by purifying selection (Ka/Ks < 1; Table S5), and expansion via both segmental and tandem duplication events (Fig. 2, Tables S4, S5), a common mechanism for gene family diversification [34, 35]. Furthermore, strong collinearity and sequence conservation with A. thaliana WD40s highlight the deep conservation of this family (Fig. 2B, Table S3) [36].
Structural and regulatory divergence underpins subfamily functional specialization
Beyond the conserved β-propeller structure that mediates protein–protein interactions [37], our analysis uncovers critical variations likely responsible for subfamily-specific functions. Although all subfamilies possess the core WD40 domain, we identified 27 distinct auxiliary domains (Table 2, Fig. 4, Figure S2). Importantly, the distribution of these domains is non-random and exhibits strong subfamily associations, providing primary evidence for functional divergence. For example, the LisH and CTLH domains are predominantly co-localized in Subfamily VI members (e.g., PyWD40-14, PyWD40-33, PyWD40-34), whereas the RING-Ubox domain is sparsely distributed across Subfamilies II, V, VI, and VIII. Given the known roles of these domains in protein dimerization, ubiquitination, and transcriptional regulation [38–40], their distinct distribution patterns imply specialized molecular functions for the respective subfamilies.
This structural divergence is paralleled by distinct regulatory landscapes. Cis-acting element analysis revealed minimalistic promoter profiles in subfamilies II and III, containing primarily light-responsive or combined light/anaerobic induction elements. In stark contrast, subfamilies IV through VIII possess complex promoters enriched with elements responsive to ABA, MeJA, and growth regulators; GA-responsive elements area distinctive hallmark of subfamilies IV and V. Subfamily I displays an intermediate and heterogeneous profile, where most members share a core set of stress-responsive elements but notable exceptions (e.g., PyWD40-43, PyWD40-59, PyWD40-226) exist, hinting at sub-functionalization (Fig. 5, Table S7). This clear partitioning of promoter architectures strongly suggests that the eight subfamilies are governed by different transcriptional programs and participate in distinct biological processes.
The phylogenetic association with functionally characterized Arabidopsis WD40 proteins provides compelling support for these functional predictions. For example, several PyWD40s within the clade containing Subfamily V cluster closely with Arabidopsis MSI1. Since AtMSI1 is a core component of the chromatin-remodeling FIS/PRC2 complex involved in gene silencing and development, this association suggests that Subfamily V members may have conserved roles in epigenetic regulation and ribosome biogenesis [41]. Similarly, the placement of stress-related genes like Arabidopsis XIW1 (involved in drought stress) within clades containing Subfamilies VIII members corroborates the inference from cis-element analysis that these subfamilies are specialized for abiotic stress response [14]. Thus, the phylogenetic tree serves not only as a classification tool but also as a functional roadmap, leveraging knowledge from model species to generate testable hypotheses about PyWD40s functions.
Integrated expression and interaction networks support subfamily roles in salt stress adaptation
The functional relevance of this subfamily classification is corroborated by salt stress response data. Expression profiling showed that members of the "complex" subfamilies (IV-VIII) were prominently upregulated under salt stress (Fig. 6), which aligns with their enrichment for ABA- and MeJA-responsive promoter elements. The protein interaction network offers mechanistic insights: the coordinated upregulation of PyWD40s like PyWD40-19, PyWD40-32, and PyWD40-156 with predicted partners—such as transcription initiation factors (TFIID), RNase H, and ATPases—suggests that different subfamilies contribute to salt tolerance via distinct mechanisms involving transcriptional regulation, nucleic acid metabolism, and energy homeostasis (Fig. 7, Tables S9, S10) [42–44]. The heterogeneous expression patterns within Subfamily I further align with its variable cis-element composition, validating the predicted functional divergence.
In summary, the eight-subfamily classification of PyWD40s is not merely structural but constitutes a functionally informative framework. This framework, supported by integrated evidence from domains architecture, promoter elements, and expression dynamics, leads us to hypothesize that Subfamilies II and III are specialists in basal environmental sensing, whereas Subfamilies IV-VIII are genera lists adept at integrating complex hormonal and stress signals, particularly during abiotic stress adaptation like salinity.
Conclusion
This study identified 258 WD40 proteins in P. yunnanensis, which were phylogenetically classified into eight subfamilies. These subfamilies have undergone expansion through gene duplication events and exhibit high sequence conservation. The prevalence of β-sheet structures in PyWD40 proteins supports their potential as scaffolds for protein–protein interactions. Protein interaction network analysis revealed that PyWD40s partner with proteins involved in various stress response pathways, and their co-expression under salt stress provides mechanistic clues linking PyWD40 function to specific regulatory networks and promoter elements. Collectively, these findings advance our understanding of the molecular basis for salt stress tolerance in P. yunnanensis.
Material and methods
Identification and physicochemical property analysis of WD40 protein family in P. yunnanensis
The hidden Markov model (HMM) profile for the WD40 domain (PF00400) was obtained from the Pfam database (http://pfam.xfam.org) [45]. Homologous WD40 proteins were identified from the whole-genome sequences of six Populus species (P. yunnanensis, P. trichocarpa, P. tomentosa, P. euphratica, P. alba, and P. deltoides) using HMMER3.0 software with an E-value cutoff of < 0.05 [46]. Redundant sequences were removed. The presence of the characteristic WD40 domain in all candidate proteins was further verified using the batch CD-search tool on the NCBI website (https://www.ncbi.nlm.nih.gov/) [47] and the SMART online database (https://smart.embl.de/) [48].
Physicochemical properties of the identified P. yunnanensis WD40 proteins, including amino acid number, molecular weight, theoretical isoelectric point (pI), instability index, aliphatic index, and grand average of hydropathicity (GRAVY) were predict using TBtools [49]. Subcellular localization was predicted using WOLF PSORT website (https://wolfpsort.hgc.jp/) [50].
Phylogenetic analysis and classification of the WD40 family in P. yunnanensis
Protein sequences of WD40 members from P. yunnanensis and A. thaliana [27], were aligned using MAFFT (https://mafft.cbrc.jp/alignment/software/) [51]. A maximum likelihood (ML) phylogenetic tree was constructed with FastTree software (version 2.1.11) [52] under the LG amino acid substitution model with a CAT approximation (20 rate categories).
Tree topology was optimized using a combination of normal search, Nearest Neighbor Interchange (NNI), and Subtree Pruning and Regrafting (SPR) strategies. Branch support was assessed with Shimodaira–Hasegawa (SH)-like approximate likelihood ratio tests based on 1000 replicates. The resulting tree was visualized and annotated using the ITOL website (https://itol.embl.de/) [53], a Chlamydomonas reinhardtii WD40 protein Cre06.g302750_4532 was used as the outgroup (https://phytozome-ext.jgi.doe.gov/report/gene/CreinhardtiiCC_4532_v6_1/Cre06.g302750_4532). P. yunnanensis WD40 proteins were classified into eight subfamilies based on phylogenetic clustering and the established classification of A. thaliana WD40s.
Chromosomal localization and collinearity analysis of the WD40 family in P. yunnanensis
Chromosomal locations of P. yunnanensis WD40 genes were mapped using TBtools based on genome annotation data (GFF file). Intraspecific collinearity among P. yunnanensis WD40 genes was analyzed using TBtools with the whole-genome sequence and annotation as input. Interspecific collinearity between P. yunnanensis and A. thaliana WD40s genes was also investigated using TBtools. The nonsynonymous to synonymous substitution rate ratio (Ka/Ks) for duplicated P. yunnanensis WD40 gene pairs was calculated using TBtools.
Protein model prediction and multiple sequence alignment of WD40 proteins in P. yunnanensis
Representative protein sequences from each of the eight WD40 subfamilies in P. yunnanensis were submitted to the SWISS-MODEL online website (https://swissmodel.expasy.org/interactive/) [54] for homology-based three-dimensional structure prediction. The resulting structural models, along with multiple sequence alignment files generated by MAFFT (see Sect. 2), were used as input for ESPript 3.0 (https://espript.ibcp.fr/ESPript/ESPript/index.php) [55] to generate figures integrating sequence alignment and secondary structure visualization.
Analysis of conserved protein sequences, domains and genes structure of the WD40 family in P. yunnanensis
Conserved motifs within the WD40 protein sequences were identified using the MEME website (https://meme-suite.org/meme/tools/meme) [56], with a parameter setting to discover up to 10 motifs. Protein domains were predicted using the Batch CD-search tool on the NCBI website (https://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi). Gene structure diagrams (exon–intron organization) and integrated visualizations combining motif, domain, and gene structure information were generated using TBtools based on the P. yunnanensis genome annotation (GFF3 file).
Prediction of cis-acting elements within the WD40 family of P. yunnanensis
The 2000 bp genomic sequences upstream of the translation start site of each WD40 genes were extracted using TBtools. Cis-acting regulatory elements within these promoter regions were predicted using the PlantCARE website (https://bioinformatics.psb.ugent.be/webtools/plantcare/html/) [57]. Results were visualized using TBtools.
Expression profile analysis of the WD40 family in P. yunnanensis under salt stress and quantitative real-time PCR (qRT-PCR)
Transcript expression levels (FPKM values) for WD40 genes were obtained from a previously published RNA-seq dataset (NCBI BioProject: PRJNA998345) generated from P. yunnanensis leaves under control (CK), short-term low-concentration salt stress (T1: 25 mM NaCl, 2 d), and long-term high-concentration salt stress (T4: 75 mM NaCl, 2 d) conditions [58]. A heatmap was generated based on log2- transformed FPKM values using TBtools.
To validate the RNA-seq results, total RNA was isolated from leaf samples (CK, T1, T4) using a Plant Total RNA Extraction Kit (TIANGEN, DP190813, Beijing, China). First-strand cDNA was synthesized using a commercial reverse transcription kit (TIANGEN, AT321, Beijing, China). qRT-PCR was performed on selected differentially expressed WD40 genes using gene-specific primers (Table S11). The P. yunnanensis gene Poyun22323, identified as a stable reference from the transcriptome data, was used as an internal control [59].
Protein–protein interaction network analysis of the WD40 family in P. yunnanensis
Representative P. yunnanensis WD40 protein sequences were submitted to the STRING database (https://cn.string-db.org/) [60] to predict potential protein–protein interactions. Predictions were restricted to the Populus taxon, and only interactions with the highest confidence score were retained for further analysis.
Supplementary Information
Supplementary Material 1. Supplementary Material 2. Supplementary Material 3. Supplementary Material 4. Supplementary Material 5. Supplementary Material 6. Supplementary Material 7. Supplementary Material 8. Supplementary Material 9. Supplementary Material 10. Supplementary Material 11. Supplementary Material 12.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Sondek J, Bohm A, Lambright DG, Hamm HE, Sigler PB. Crystal structure of a G-protein beta gamma dimer at 2.1A resolution. Nature. 1996;379(6563):369–374. https://doi.org/10.1038/379369 a 0.10.1038/379369 a 08552196 · doi ↗ · pubmed ↗
- 2Yan C, Yang T, Wang B, Yang H, Wang J, Yu Q. Genome-Wide Identification of the WD 40 Gene Family in Tomato (Solanum lycopersicum L.). Genes. 2023;14(6):1273. 10.3390/genes 14061273.10.3390/genes 14061273 PMC 1029811737372453 · doi ↗ · pubmed ↗
- 3Lin T, Zhu X, Zhang F. Wan X. The Interaction Effect of Cadmium and Nitrogen on Populus yunnanensis. Journal of Agri Sci. 2012;4(2):1518–26. 10.5539/jas.v 4n 2p 125.
- 4Rubio A, de Toro M, Pérez-Pulido AJ. The most exposed regions of SARS-Co V-2 structural proteins are subject to strong positive selection and gene overlap may locally modify this behavior. m Systems 2024, 9(1):e 0071323. 10.1128/msystems.00713-2310.1128/msystems.00713-23PMC 1080494938095866 · doi ↗ · pubmed ↗
- 5Wu R, Jia Q, Guo Y, Lin Y, Liu J, Chen J, Yan Q, Yuan N, Xue C, Chen X, et al. Characterization of TBP and TA Fs in Mungbean (Vigna radiata L.) and Their Potential Involvement in Abiotic Stress Response. Int J Mol Sci. 2024;25(17):9558. 10.3390/ijms 25179558.10.3390/ijms 25179558 PMC 1139478139273505 · doi ↗ · pubmed ↗
- 6Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, Geer RC, He J, Gwadz M, Hurwitz DI, et al. CDD: NCBI's conserved domain database. Nucleic acids research. 2015;43(Database issue):D 222–226. 10.1093/nar/gku 1221.10.1093/nar/gku 1221 PMC 438399225414356 · doi ↗ · pubmed ↗
- 7Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai K. Wo LF PSORT: protein localization predictor. Nucleic acids research. 2007;35(Web Server issue):W 585–587. 10.1093/nar/gkm 259.10.1093/nar/gkm 259PMC 193321617517783 · doi ↗ · pubmed ↗
