Identification of candidate genes for heat tolerance in rice by genome-wide association study and transcriptome sequencing
You Zhou, Keyang Li, Chenghang Tang, Manqiong Zhu, Yaling Bao, Wei Zhang, Pengfei Li, Wenhao Lv, Meng Zhang, Chunni Wang, Dewen Zhang, Yingyao Shi

TL;DR
This study identifies genes in rice that help it tolerate heat during the critical heading stage, using genetic and transcriptomic data.
Contribution
The study combines GWAS and transcriptome sequencing to identify five key candidate genes for heat tolerance in rice.
Findings
206 genes related to heat tolerance were identified through GWAS.
57 genes were co-detected by GWAS and transcriptome sequencing.
Five candidate genes showed consistent expression trends verified by qRT-PCR.
Abstract
Heat stress during the heading stage has a particularly negative impact on rice. In this study, we employed 159 core rice germplasms as materials, set up high-temperature and normal-temperature treatments during the heading stage, and measured the phenotypic traits such as the seed-setting rate, the number of grains per panicle, and the thousand-grain weight. Combining the results and data from whole-genome sequencing, we employed genome-wide association study (GWAS) to identify the heat tolerance-related genes or QTLs in rice. Additionally, we selected two heat-tolerant and two heat-intolerant varieties for transcriptome sequencing after high-temperature treatment during the heading stage, and further screened and analyzed the candidate genes based on GWAS and transcriptome sequencing. As a result, a total of 206 genes related to heat tolerance during rice heading were detected by…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9- —the Key Research and Development Program of Anhui Province
- —Science and Technology of Innovative research program of Anhui Province
- —Major scientific and technological innovation research tasks of Bio-breeding Laboratory of Anhui Province
- —the National Key Research and Development Program of China
- —the National Natural Science Foundation of China
- —the Improved Varieties Joint Research (Rice) Project of Anhui Province
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenetic Mapping and Diversity in Plants and Animals · Plant Molecular Biology Research · Heat shock proteins research
Introduction
Rice is an important food crop for more than half of the world’s population [1]. Rice production is often affected by high temperature during the reproductive stage, which has significant negative impacts on grain filling and yield [30]. Heat stress caused by global warming is a crucial environmental factor affecting crop growth and yield. It has been estimated that every 1 °C of increase in global average temperature will lead to a 3.2% decline in rice yield [49]. Moreover, the increasing frequency of extreme high-temperature events poses severe challenges to global food security [26].
Heading is a major growth stage, and is influenced by various exogenous factors such as the photoperiod, temperature, and nutrient availability, particularly the photoperiod [25, 44]. Rice are highly sensitive to temperature changes. An increase in temperature will lead to earlier heading, while a decrease in temperature will delay the heading, which is known as temperature sensitivity [5]. High temperature during the heading stage can cause spikelet sterility in rice, posing a significant threat to rice production under climate change [39]. The sensitivity of rice heading to temperature varies among different regions and genotypes. The genotypes adapted to high-latitude regions usually exhibit photoperiod insensitivity and distinct high-temperature responses [12].
In recent decades, the vigorous development of bioinformatics and rapid increase in the number of molecular markers have contributed to a sharp increase in studies using genome-wide association study (GWAS) to identify complex quantitative traits [20, 35]. GWAS has been utilized to identify causal loci for many important agronomic traits, such as grain length and thousand-grain weight (TGW) [34, 41]. However, in GWAS, the impacts of population genetic structure and allele frequency are often overlooked, which increases the probability of false positives in linkage disequilibrium (LD) mapping [46]. Combination of linkage analysis and correlation analysis has promoted the attempts to explore quantitative agronomic traits. More importantly, it has been confirmed that integration of these methods can further enhance the efficiency and accuracy of GWAS [27].
GWAS allows more precise mapping of related quantitative trait loci (QTLs) and genes, as well as paves a new way for their application in production [13]. To date, numerous QTLs related to heat tolerance have been identified and validated during the vegetative and reproductive stages on the 12 rice chromosomes. For example, qHTH5, a heat-tolerance QTL for the heading and flowering stages, was detected within a range of approximately 304.2 kb on the short arm of chromosome 5 [2]. All the three QTLs (qNS4, qHTS4, and qRRS4) in the SSR marker interval from RM471 to RM177 on chromosome 4 were found to be involved in epistatic interactions and environmental interactions and cause phenotypic variations, indicating that this interval is a major QTL hotspot [22]. Fan et al. identified three heat-tolerance QTLs and two novel heat-tolerance loci (qTT4 and qTT5) for longistaminata at the seedling stage [9]. Two major heat-tolerance QTLs, qHTSF1.1 and qHTSF4.1, are located on chromosomes 1 and 4, respectively, whose heat-tolerant alleles respectively originated from IR64 and N22 [45]. Kilasi et al. used a recombinant inbred line population constructed by crossing N22 and IR64, and detected six root-length QTLs and two shoot-length QTLs under control condition [16]. Moreover, rlht5.1 was identified as the major QTL controlling the root length under heat stress [51] Another study established a set of chromosome segment substitution lines and identified 11 QTLs related to heat tolerance, among which eight were overlapped with previously reported QTLs, and three (qPSLht4.1, qPSLht7, and qPSLht10.2) were novel QTLs [53]. A major heat-tolerance QTL qHTB3-3 was detected on the long arm of chromosome 3, which is located between the RM3525 and 3-M95markers with a physical distance of approximately 2.8 Mb. From a cross between the heat-tolerant variety Huanghuazhan and heat-sensitive variety 9311 A, a heat-tolerance QTL qHTT8 was mapped on chromosome 8, which spanned a physical distance of about 0.965 Mb [6] .
In recent years, some progress has been achieved in the research on heat tolerance of rice. For instance, the genetic locus TT3.2, which consists of the antagonistic genes TT3 and TT3.1, has been demonstrated to govern the heat tolerance of rice [48]. Another study demonstrated that OsHTAS interacts with components in the 26 S proteasome system and subtypes of rice ascorbate peroxidase, thereby enhancing rice heat tolerance by regulating hydrogen-peroxide-induced stomatal closure [24]. HTS1 is mainly expressed in green tissues such as leaves and leaf sheaths, and is significantly induced by heat stress. It encodes a β-ketoacyl-carrier-protein reductase located in the chloroplast thylakoid membrane and is involved in the biosynthesis of new fatty acids. Mutation in HTS1 was found to impair the integrity and stability of the cell membrane system under heat stress, causing abnormal ROS and Ca²⁺ signal transduction and reducing the heat tolerance [4]. Moreover, 9-cis-epoxycarotenoid dioxygenase (NCED) is a key rate-limiting enzyme in the ABA biosynthesis pathway, and overexpression of OsNCED1 could enhance the heat tolerance of rice by increasing the antioxidant capacity [52]. OsCNGC14 regulates Ca²⁺ signaling under heat stress and is crucial for plant heat tolerance. Absence of OsCNGC14 will reduce or eliminate heat-stress-induced Ca²⁺ signaling in the cell membrane [7]. Some other studies have demonstrated that the OsNSUN2, OsSGS3, and OsTOGR1 genes target RNA to improve rice heat tolerance by enhancing mRNA modification in the detoxification system, regulating trans-acting small interfering RNAs targeting auxin-response factors, and promoting the activity of RNA-modifying enzymes [11, 38, 42].
This study employed 159 natural population materials with complete genome sequencing data (127 Xian rice varieties, 28 Geng rice varieties, three Admix varieties, and one AUS variety) for GWAS by using the mixed-linear model (MLM) and Fixed and random model Circulating Probability Unification (Farm CPU), and conducted phenotypic analysis on these germplasms under heat stress during the heading stage, aiming to identify QTLs significantly associated with heat tolerance in rice. The study also predicted the candidate genes through gene annotation, haplotype analysis, and transcriptome analysis. The findings are expected to provide theoretical and practical reference for the cultivation and breeding of heat-tolerant rice varieties.
Results
Phenotypic analysis
A statistical analysis was performed on heading-related indicators such as seed-setting rate (SSR), grain number per panicle (GNP), and TGW of all rice germplasms under normal and high-temperature conditions (Table S2). To eliminate the influence of genetic backgrounds of different germplasms, the relative values of the above-mentioned traits were calculated. As shown in Table S2, there were significant differences in the mean, standard deviation, and coefficient of variation (CV) of SSR, GNP, and TGW between the two temperature conditions. Specifically, heat stress led to significant decreases in the mean, standard deviation, and CV of SSR. Heat stress significantly amplified the phenotypic variation of the seed-setting rate within the population. The coefficient of variation surged dramatically from 9.26% under the control condition to 68.38%. Simultaneously, the extreme-value of the population’s seed-setting rate contracted markedly. Specifically, the maximum value declined from 0.96 to 0.52, and the minimum value decreased from 0.60 to 0.05. This indicated that each trait exhibited abundant genetic variations under different conditions, and the phenotypic values of most traits showed a normal distribution (Figure S1). Therefore, it can be speculated that all traits are quantitative traits and can be used as indicators to evaluate the heading ability of rice under high-temperature condition.
Correlation analysis
We set nine indicators related to heading under different temperature conditions, including HGNP (grain number per panicle under high-temperature stress), HSSR (seed-setting rate under high-temperature stress), HTGW (thousand-grain weight under high-temperature stress), CKGNP (grain number per panicle under normal condition), CKSSR (seed-setting rate under normal condition), CKTGW (thousand-grain weight under normal condition), RGNP (relative grain number per panicle), RSSR (relative seed-setting rate), and RTGW (relative thousand-grain weight). Correlation analysis showed that most of these indicators were significantly or extremely significantly correlated with each other (Fig. 1). Traits such as HGNP and HTGW showed an almost absolute positive correlation (correlation coefficient Corr. = 0.975**), indicating a tight phenotypic association among such traits. In contrast, some trait combinations (such as CKSSR and CKGNP, RGNP and RSSR) did not show significant correlation (correlation coefficient Corr. approaching 0), suggesting that the variations of these traits are independent of each other in this population. Therefore, all the nine indicators can be used for subsequent GWAS analysis.
Fig. 1. Correlations of the nine indicators under high-temperature and normal conditions. Note: HGNP (grain number per panicle under high-temperature stress), HSSR (seed-setting rate under high-temperature stress), HTGW (thousand-grain weight under high-temperature stress), CKGNP (grain number per panicle under normal condition), CKSSR (seed-setting rate under normal condition), CKTGW (thousand-grain weight under normal condition), RGNP (relative grain number per panicle), RSSR (relative seed-setting rate), RTGW (relative thousand-grain weight)
Phylogenetic and population structure analysis
A total of 4,817,964 independent SNPs were used for genetic structure analysis (Figure S2). The phylogenetic tree showed that the 159 germplasms could be classified into four groups, including 127 Xian rice varieties, 28 Geng rice varieties, three admixed rice varieties (intermediate types), and one Aus rice variety (Fig. 2a). Similar results were obtained in principal component analysis (PCA), where the first two principal components (PCs) could explain most of the genetic variations. A comparison of the two PCs revealed that most germplasms were clustered along PC2 (Fig. 2b). The outcomes of population-structure analysis suggest that the genetic background of the population under study displays remarkable population-specific characteristics. A portion of the samples is predominantly composed of a single ancestral component, whereas the rest of the samples bear a mixed genetic signal stemming from multiple ancestral components(Fig. 2c). The kinship clustering heatmap analysis reveals that most of the tested germplasm materials exhibit low genetic similarity, which aligns with the genetic background characteristics of natural populations(Fig. 2d). Subsequently, an LD decay analysis was conducted on all 159 germplasms. The calculated LD decay distance amounted to 118 kb (Fig. 2e). These results demonstrated that the LD level of the 159 rice germplasms is sufficient for GWAS analysis.
Fig. 2. Population structure analysis of different rice germplasms. Note: a: Phylogenetic tree of 159 rice germplasms, including xian rice in green, Geng rice in orange, Admix in blue, and AUS in mauve. b: Principal component analysis of the phylogenetic tree of 159 rice germplasms. c: Population structure analysis of 159 rice germplasms d: Heatmap of the relationships among 159 rice germplasms. e: LD decay curve of 159 rice germplasms
Genome-wide association study of traits related to heading under heat stress
GWAS was conducted on nine indicators related to heading under heat stress in 159 rice germplasms using the Farm CPU and MLM models based on the TASSEL method. The Farm CPU and MLM models were used to screen important loci through GWAS, with a threshold of P = 1.31 × 10⁻⁵. SNPs were merged based on the LD decay distance. A total of 105 significantly associated SNP loci were detected under the two models (Table S3, Fig. 3, Figures S3, S4 and S5), which were located within 105 QTL intervals (Figure S6).
Fig. 3. Manhattan plot and quantile-quantile plot (Q-Q plot) of the seed setting rate under heat stress (HSSR) in rice under heat stress
Screening of candidate genes for heat tolerance in rice germplasms
Among the 105 significantly associated SNP loci detected, 24 loci were significantly associated with multiple indicators, while each of the remaining SNP loci was significantly associated with single indicator. The SNP locus 262,351 on chromosome 9 demonstrated a significant association with four traits: HTGW, RTGW, HSSR, and RSSR. Concurrently, the SNP locus 35,668,943 on chromosome 2 exhibited significant associations with three traits: HTGW, RTGW, and HSSR. These findings imply that genes situated in the vicinity of these key “pleiotropic” SNP loci might assume a pivotal role in regulating yield-related traits in rice under both heat and normal conditions. After removal of genes annotated as transposons and anti-transposon proteins, 1232 genes were mapped within the 105 QTL intervals (Table S4), which were then analyzed using the Rice Annotation Project Database. Finally, 206 genes that might play a role in the heat tolerance of rice during heading were screened according to the gene annotation information (Table S5).
Analysis of transcriptome sequencing results
Filtering and quality assessment of the sequencing data
According to the preliminary phenotypic data, we selected two heat-tolerant rice varieties, Zimi and Guanglu’ai 15 − 1 (sequencing numbers of B090 and B119, respectively), and two heat-intolerant rice varieties, BA BAI GU and Hongqi 5 (sequencing numbers of IRIS 313-10221 and B215, respectively). Then, we collected the samples, extracted the RNA, and constructed libraries under high and normal temperature conditions. A total of 165.2 Gb high-quality reads (clean reads) were obtained from the transcriptome libraries of the four varieties, with nearly 6.88 Gb of high-quality reads for each sample. The GC content and Q30 of each transcriptome were 48.77%–52.63% and 89.97%–95.88%, with an average of 50.85% and 93.39%, respectively. These results indicated sufficient quality of RNA sequencing for expression analysis (Table 1).
Table 1. Transcriptome statisticsSampleRaw ReadsClean ReadsClean Bases(G)Clean Rate(%)Q20(%)Q30(%)GC Content(%)mapped reads (%)CK27-141,191,38641,123,5546.1199.8496.3990.0851.1591.26%CK27-246,834,89046,756,5266.9299.8396.7290.7749.6790.64%CK27-341,036,11840,967,0666.0699.8397.7593.1650.3492.20%CK28-148,686,63648,606,9927.2499.8497.3492.1749.1889.62%CK28-253,250,30453,198,8347.9299.997.1291.4251.0889.80%CK28-348,556,68848,479,2707.2199.8498.1394.0949.690.28%CK29-141,364,80841,336,1366.1799.9398.6195.2649.3392.41%CK29-241,776,81241,750,1186.2399.9498.3794.4351.2393.24%CK29-364,730,44464,686,6429.6499.9398.2494.1852.6393.49%CK30-149,876,55649,841,2987.4299.9398.5595.1351.6196.35%CK30-253,169,33453,129,3207.9199.9298.7695.8351.2596.80%CK30-348,555,72048,520,6047.2299.9398.529551.7496.07%G27-141,186,52841,130,8766.1299.8696.4989.9751.6895.43%G27-242,649,86042,581,6486.3299.8497.9893.7351.1396.18%G27-346,402,53846,323,8766.8499.8397.9293.7652.1892.98%G28-143,073,93843,008,9526.3799.8596.7790.8550.5791.04%G28-243,216,64643,178,3006.3999.9196.8690.9748.7790.45%G28-349,024,61048,947,0387.2799.8496.8590.9750.1890.37%G29-149,265,86649,239,1687.3199.9598.1793.6149.9491.09%G29-241,020,27840,990,5886.0999.9398.3994.6450.2591.01%G29-344,979,64444,947,7926.799.9398.4694.8952.0893.78%G30-141,056,01041,024,3806.1199.9298.7595.8852.2597.35%G30-248,359,97848,326,4787.299.9398.5294.9951.1597.36%G30-343,592,09443,560,6346.4399.9398.6695.551.3297.29%CK27: Weak heat tolerance variety Gunagluai15-1 under normal condition; CK28: Weak heat tolerance variety Zimi under normal condition; CK29: Strong heat tolerance variety BA BAI GU under normal condition; CK30: Strong heat tolerance variety Hongqi5hao under normal condition; G27: Weak heat tolerance variety Gunagluai15-1 under heating condition; G28: Weak heat tolerance variety Zimi under heating condition; G29: Strong heat tolerance variety BA BAI GU under heating condition; G30: Strong heat tolerance variety Hongqi5hao under heating condition
Analysis of differentially expressed genes
By analyzing the phenotypic data, the heat-tolerant samples (Zimi, Guanglu’ai 15 − 1) were labeled as G27 and G28 under high-temperature condition, and as CK27 and CK28 under normal condition, respectively. The heat-intolerant samples (BA BAI GU, Hongqi 5) were labeled as G29 and G30 under high-temperature condition, and as CK29 and CK30 under normal condition, respectively. Based on these eight samples, a total of eight comparison groups were constructed, including G27 vs. G29, G27 vs. G30, G28 vs. G29, G28 vs. G30, CK27 vs. CK29, CK27 vs. CK30, CK28 vs. CK29, and CK28 vs. CK30. As shown in Fig. 4, the gene expression levels were significantly different not only between different varieties but also between different treatments.
Fig. 4. Volcano plots of gene expression in eight comparison groups under normal and high-temperature conditions. Note: Volcano plots of (a) CK27 vs. CK29, b CK27 vs. CK30, c CK28 vs. CK29, d CK28 vs. CK30, e G27 vs. G29, f G27 vs. G30, g G28 vs. G29, and h G28 vs. G30
The distribution of differentially expressed genes (DEGs) in the four control groups under each condition was compared by Venn diagrams (Fig. 5). In terms of DEGs between heat-tolerant and in-tolerant varieties, the four comparisons of the control groups shared nine DEGs and the four comparisons of the high-temperature groups shared 136 DEGs.Utilizing the transcriptome data of four rice varieties under normal and high-temperature conditions, the hierarchical-clustering heatmap of 57 key differentially expressed genes vividly depicts the substantial divergence in their expression patterns between heat-tolerant and heat-sensitive varieties(Figure S7).
Fig. 5. Distribution of differentially expressed genes (DEGs) between heat-tolerant and heat-intolerant varieties under different conditions. Note: a Distribution of DEGs in different varieties under normal condition. b Distribution of DEGs in different varieties under high-temperature condition
Functional annotation and enrichment analysis of differentially expressed genes
GO classification of differentially expressed genes
The top 15, 10, and 10 valid terms of BP, CC, and MF were selected and presented in a histogram. In Fig. 6, the abscissa represents the GO terms, and the ordinate represents the number of genes enriched in the GO terms, with a higher value indicating greater significance. In terms of biological processes (BP), the terpenoid metabolic process, terpenoid biosynthetic process, and isoprenoid metabolic process were significantly enriched in all three comparison groups. Regarding cellular components (CC), it was mainly related to the chloroplast stroma, plastid stroma, thylakoid membrane, and photosynthetic membrane. This indicates that high temperatures severely disrupt the photosynthetic system of rice. In terms of molecular functions (MF), active transmembrane transporter activity was also present in the three comparison groups. This suggests that heat stress may generally affect crucial transmembrane transport functions such as the maintenance of ion balance and metabolite transport in cells, which could be an important factor contributing to the impairment of photosynthesis and the disruption of cellular homeostasis.
Fig. 6GO annotation classification of differentially expressed genes in four comparison groups under normal condition and high temperature condition. Note: GO annotation classification of (a) CK27 vs. G27, b CK28 vs. G28, c CK29 vs. G29, d CK30 vs. G30. BP: biological process; CC: cellular component; MF: molecular function
KEGG enrichment analysis
In organisms, different DEGs usually coordinate with each other to exert their biological functions. Pathway-based analysis can reveal their biological functions, and pathway significance enrichment analysis can be used to identify the most important biochemical metabolic pathways and signal transduction pathways in which the DEGs are involved. Hence, to further screen the genes that affect rice heading, growth, and development under heat stress, we performed a KEGG metabolic pathway analysis on the DEGs.
Considering that the enriched genes showed different responses to heat stress in rice germplasms with different levels of heat tolerance, a KEGG analysis was carried out separately for different rice germplasms under high-temperature and normal conditions (Fig. 7). Under normal condition, the pathways significantly enriched in the four combinations of CK27 vs. CK29, CK27 vs. CK30, CK28 vs. CK29, and CK28 vs. CK30 were mainly the biosynthesis of various secondary metabolites (osa00999) and metabolism of starch and sucrose (osa00500). Obviously, these pathways were enriched due to the differences among the four combinations resulting from different varieties. Under high-temperature condition, in the four combinations of G27 vs. G29, G27 vs. G30, G28 vs. G29, and G28 vs. G30, besides those pathways significantly enriched under control condition, there were some other significantly enriched pathways, such as phenylalanine, tyrosine, and tryptophan biosynthesis (osa00400), amino sugar and nucleotide sugar metabolism (osa00520), cofactor biosynthesis (osa01240), nucleotide sugar biosynthesis (osa01250), glutathione metabolism (osa00480), and motor proteins (osa04814). These pathways may be important components of the regulatory mechanism of rice heat tolerance.
Fig. 7KEGG annotation classification of differentially expressed genes in four rice varieties with different heat-tolerance levels. Note: a KEGG enrichment analysis under normal condition; b KEGG enrichment analysis under high-temperature condition
Analysis of candidate genes for heat tolerance during rice heading
To identify the candidate genes for heat tolerance, based on the results of transcriptome sequencing, the DEGs in the eight comparisons under normal and high-temperature conditions were used as a gene set. As a result, 2, 24, 8, 23, 22, 5, 20, 17, and 8 genes were mapped to CKSSR, CKGNP, CKTGW, HSSR, HGNP, HTGW, RSSR, RGNP, and RTGW, respectively. After removal of the duplicates, a total of 57 genes were co-localized in GWAS and transcriptome analysis (Table S6). These candidate genes were further screened through the Rice Genome Annotation Project Database based on gene functional annotation. Finally, five candidate genes were identified, including LOC_Os03g18010 (located at qCKTGWFarmCPU3), LOC_Os07g43700 (located at qCKGNPFarmCPU7, qCKGNPMLM7, qHGNPFarmCPU7, and qHGNPMLM7), LOC_Os10g22980 (located at qHSSRFarmCPU10.1, qRSSRFarmCPU10.1), LOC_Os10g41660 (located at qHSSRFarmCPU10.2, qRSSRFarmCPU10.2), and LOC_Os11g39020 (located at qCKGNPFarmCPU11.2, qHGNPFarmCPU11.2).
Haplotype analysis of candidate genes
Among the five candidate genes, LOC_Os03g18010 encodes phospholipase C. Previous research has indicated that phospholipid signaling is a crucial component of the heat stress response [29]. As depicted in Fig. 8, three major haplotypes were detected at one SNP site in the promoter, three SNP sites in the exon, and one SNP site in the intron of LOC_Os03g18010. The average CKTGW values of Hap-A, Hap-B, and Hap-C were 22.75 g, 23.67 g, and 23.81 g, respectively. Hap-A showed significant differences in CKTGW from both Hap-B and Hap-C.
Fig. 8. Haplotype analysis of *LOC_Os03g18010. Note: *Identification of candidate genes for CKTGW. a Based on six SNPs in all evaluated rice germplasms, three haplotypes of LOC_Os03g18010 were identified. In the gene structure diagram of LOC_Os03g18010, the exon is represented by blue frame. The thin blue line represents the genomic location of each SNP. Haplotypes with fewer than 10 germplasms are not shown. b CKTGW based on single polymorphism and LD heat map of local Manhattan map. c Around the peak on chromosome 3, red dotted lines represent candidate regions for associated SNPs. d Based on CKTGW of LOC_Os03g18010 haplotype, differences between haplotypes were statistically analyzed using the Tukey’s test
The function of LOC_Os07g43700 is to encode lactate/malate dehydrogenase. It has been shown that this type of enzymes play a variety of roles in response to abiotic stresses, such as hypoxia, salinity, and high temperature [3]. Three major haplotypes were detected at 11 SNP sites in the promoter and six SNP sites in the exon of LOC_Os07g43700 (Figure S8). LOC_Os10g22980 encodes a leucine-rich repeat domain-containing protein. It has been demonstrated that leucine-rich repeat extensins (LRX) are crucial for the growth and development of rice roots in response to heat stress [21]. Three major haplotypes were detected at three SNP sites in the promoter, eleven SNP sites in the exon and thirteen SNP sites in the intron of LOC_Os10g22980 (Figure S9). LOC_Os10g41660 encodes a C3HC4-type zinc-finger protein. A previous study has shown that the up-regulation of zinc-finger protein gene OsZFP350 can significantly increase the germination rate of seeds under abiotic stress and alleviate the damage of heat stress during rice root development [15]. Two major haplotypes were detected at one SNP site in the promoter and five SNP sites in the exon of LOC_Os10g41660 (Figure S10). LOC_Os11g39020 has been reported to encode an ABC transporter. ABC transporters were found to be involved in stabilizing the lipid distribution in cell membranes under heat stress, as well as in regulating the accumulation of carbohydrates, proteins, and lipids in grains [17]. Three major haplotypes were detected at two SNP sites in the promoter, fourteen SNP sites in the exon and seven SNP sites in the intron of LOC_Os11g39020 (Figure S11). The results of statistical analysis indicate that there are significant associations between the haplotypes of the five candidate genes and the target agronomic traits (Table S9). Following a comparative analysis of the phenotypic means of all haplotypes for the five candidate genes, it was discovered that even among some haplotypes with small sample sizes, there existed excellent haplotypes. Examples include HAPN of Os07g0630800 and HAPK of Os10g0376200(Table S10).
qRT-PCR validation
To validate the transcriptome data, the five candidate genes were further analyzed by qRT-PCR, with specific primers designed for these genes (Table S7). To estimate the correlation between fragments per kilobase of exon per million reads mapped (FPKM) and relative gene expression, the Pearson correlation coefficient (PCC) was calculated. The PCC value was the highest for Os03g0289300 (0.995), and higher than 0.7 for most of the genes. The relative expression of these candidate genes was highly consistent with the RNA-seq data (Fig. 9). These results confirmed that the transcriptome data could be used for further downstream analysis. As shown in Fig. 9, Os03g0289300 and Os10g0566400 were positively regulated, while Os07g0630800, Os10g0376200, and Os11g0603200 were negatively regulated under heat stress.
Fig. 9. Validation of the expression patterns by qRT-PCR of candidate genes from RNA-seq analysis and GWAS analysis. Note: The Y-axis on the left represents the relative expression level and its corresponding bar chart with standard deviation (presented as error bars in each graph). The Y-axis on the right represents the expression level calculated by the Fragments Per Kilobase of exon per Million mapped reads (FPKM) method, corresponding to the line graph in each graph. Moreover, the X-axis of each graph represents different rice varieties under two treatments. R denotes the Pearson correlation coefficient (PCC) between the FPKM and the relative expression level
Discussion
Genome-wide association study
Many QTL have been identified by linkage mapping; however, fine-mapping requires the time-consuming establishment of large segregating populations [28, 36, 47]. With the development of sequencing technology, GWAS has become an important tool for detecting natural variations related to plant traits. In the past decade, GWAS has been used to successfully map a large number of potential loci and key genes in rice, which have been further confirmed by subsequent functional experiments [18, 23, 37]. This study conducted GWAS on CKSSR, CKGNP, CKTGW, HSSR, HGNP, HTGW, RSSR, RGNP, and RTGW of 159 rice germplasms, resulting in the identification of 105 significant QTL intervals. Numerous studies center their analyses on the rice seedling stage as research advancements can be achieved more expeditiously at this phase. Nevertheless, in the current study, we examined the impacts of heat stress on rice during the heading and flowering stage. Consequently, the candidate quantitative trait loci (QTLs) identified through genome-wide association study (GWAS) mapping in our research diverge from those previously reported. Additionally, the known heat-responsive genes were detected in our analysis, corroborating the reliability of this study. One example is OsAKR1 (Os01g0847600), which plays a positive role in abiotic stress-related pathways by detoxifying malondialdehyde and methylglyoxal and enhancing plant tolerance to oxidative and heat stresses [40].
Transcriptome sequencing analysis
Transcriptome analysis has facilitated the identification of candidate genes that function the response to heat stress, and can limit the range of candidate genes detected by QTL mapping or GWAS. Genome-wide gene expression profiling using transcriptome sequencing analysis (RNA-seq) can effectively analyze the gene regulatory networks of complex agronomic traits and characterize plant responses to environmental stresses [43]. In this study, transcriptome sequencing identified a total of 14,227 DEGs between heat-tolerant varieties and heat-intolerant varieties under both high-temperature and normal conditions, which were then used as the gene set for testing of heat tolerance. Via the integration of genome - wide association study (GWAS), differentially expressed genes (DEGs), and gene annotation information, 57 related genes were screened. Subsequently, through the incorporation of haplotype analysis, 6 candidate genes associated with heat tolerance were identified (Table S8).Among these six genes, SCT2 (LOC_Os10g22950), which encodes a Ca^2+^-sensing transcription factor, has been previously studied. SCT1 and SCT2 can directly bind to the promoter of OsWR2 and inhibit its expression, thereby negatively regulating the heat tolerance of rice [14].
Identify the favorable haplotypes of candidate genes
A haplotype is a specific combination of alleles (markers) observed on a chromosomal segment within a given population [50]. In crop genetics, identification of superior haplotypes with important functional genes is of great significance for deciphering gene functions and applications [10]. We meticulously presented the genotypic and phenotypic characteristics of haplotypes having a sample size of ≥ 10. Simultaneously, a comparative analysis of the mean values of haplotypes with a sample size of < 10 was conducted (Table S10), with the aim of identifying potential superior haplotypes. Haplotype analysis is based on SNP analysis, mainly exploring whether changes occur in open reading frames (ORFs), frameshifts, and splice sites. Subsequent studies can further explore other types of genetic variations, including deletions, insertions, and duplications in specific regions of the genome.
Conclusion
In this study, the traits related to heat tolerance exhibited abundant phenotypic variations with a normal distribution, indicating that these germplasms are suitable for GWAS analysis. A total of 105 QTL intervals involving 1232 genes significantly associated with heat tolerance were detected. These 1232 genes were then analyzed using the Rice Annotation Project Database. Finally, 206 genes that may play a role in the heat tolerance of rice were screened based on gene annotation information. In addition, a gene set for heat tolerance was established using the transcriptome sequencing results of heat-tolerant and heat-intolerant varieties under normal and high-temperature conditions. A total of 57 genes were co-located in GWAS and transcriptome analysis. Finally, by integrating GWAS with the heat tolerance gene set and gene annotation, five genes (LOC_Os03g18010,* LOC_Os07g43700*,* LOC_Os10g22980*,* LOC_Os10g41660*, and LOC_Os11g39020) were identified as candidate genes for heat tolerance in rice. qRT-PCR verified that the relative expression trends of these five genes were highly consistent with the RNA-seq data.
Materials and methods
Experimental materials
The experimental materials were obtained from 159 natural populations, which included 127 Xian rice varieties, 28 Geng rice varieties, three Admix varieties, and one AUS variety (Table S1).
High-temperature treatment of rice during heading and phenotype determination
The 159 germplasms were sown on May 20th and transplanted on June 15th, with a seedling age of 25 days. A pot-cultivation method was adopted. Four pots were transplanted with each variety, with two pots used for high-temperature treatment and two pots as the control. The plastic pots were 30 cm in width and 20 cm in height. Each pot was filled with 1 kg of substrate and 3 kg of air-dried soil. Fertilizer and water were prepared according to the ratio of N: P: K = 10: 5: 7, with an application rate of 10 g per pot. From transplanting to the booting stage, a water layer of 1–2 cm was maintained, and during the booting stage, the water layer was 3 cm. When reaching the initial heading stage, two pots of the treatment group were moved into the greenhouse. At the same time, the panicles that would head the next day were tagged. The high-temperature treatment lasted for 5 days. During the treatment period, each day was divided into five time intervals, which started at 09:30 and lasted for 5 h. The maximum temperature was set at 40 °C (± 0.1 °C), and the humidity in the greenhouse was maintained at 60%–70%. In the remaining periods, the minimum temperature was not lower than 29 °C. Six days after the treatment, the pots were moved out of the greenhouse and placed back in the original field before 9:30. The two control pots with the corresponding sowing date grew under natural condition at their original positions. In the later stage, the soil was kept in a wet-dry cycle until the yellow ripening stage. During the sowing stage, 1–2 g of top-dressing fertilizer was applied to each pot.Six indicators, including the seed setting rate, number of grains per panicle, and thousand-grain weight (TGW) under both high-temperature and normal-temperature conditions, were determined. The heat tolerance of rice was evaluated in combination with the relative seed setting rate, relative number of grains per panicle, and relative TGW. R-studio software was employed to perform a correlation analysis of rice phenotypes.
Genotype analysis
SNPs with a minor allele frequency greater than 5% and a missing rate less than 10% were filtered via Plink, resulting in 4,817,964 polymorphic SNPs [31]. ITol was used to construct a phylogenetic tree [19].The genetic distance was obtained from the SNP data for a kinship analysis using the TASSEL software (version 5.2.40). By using the PopLDdecay software (version 3.43), the physical distance at which the LD coefficient decreased to half of the average maximum value (0.25) across the whole chromosome was identified as the LD decay distance. The principal component analysis (PCA) plot, kinship plot, linkage disequilibrium plot, and correlation analysis plot were all generated via the R programming language.
Genome-wide association study
GWAS were conducted using the rMVP package in R software(https://cran.r-project.org/web/packages/rMVP/). Fixed and random model Circulating Probability Unification (FarmCPU) and mixed linear model (MLM) were employed for association mapping [32]. A significance threshold of 1.31E-5 was applied to identify significant SNP markers. Manhattan plots and quantile-quantile (Q-Q) plots were generated using CMplot in R software(https://cran.r-project.org/web/packages/CMplot/). Candidate genes located within a 400 kb window centered on each significant SNP marker were extracted and annotated was obtained from the Ensembl Genome Database.
Sampling, RNA extraction and sequencing
Based on the results of preliminary phenotypic identification, two heat-tolerant varieties (Zimi and Guanglu’ai 15 − 1) and two heat-intolerant varieties (BABAI GU and Hongqi 5) were selected. The four previously-mentioned rice varieties, which exhibited uniform growth and consistent heading-flowering stages, were chosen and cultivated under two experimental conditions: ambient temperature (control) and 40 °C high-temperature stress. Specifically, the high-temperature treatment group was exposed to continuous stress for 5 days, while the ambient-temperature control group was cultivated synchronously for the same period. For each variety, three biological replicates were set up under both conditions. Once the high-temperature stress treatment was completed, the functional leaves of plants of each genotype were collected, promptly snap-frozen in liquid nitrogen, and then transferred to an ultra-low-temperature refrigerator at − 80 °C for storage and subsequent use in experiments. RNA extraction, library construction, and high-throughput sequencing procedures were described previously [37].
RNA-seq analysis
Raw reads were subjected to initial filtering.The resulting clean reads were then mapped to the rice reference genomes using Hierarchical Indexing for Spliced Alignment of Transcripts 2 (HISAT2). The StringTie software was used to count the reads of all mRNAs, and FPKM was used as a measure of gene expression levels. For the differential expression analysis, the DESeq2 R package was utilized. Genes with a fold-change (FC) exceeding two (|log2(FC)|≥1) and a false discovery rate (FDR) of less than 0.05 were regarded as differentially expressed. To gain a deeper understanding of the functional implications of the differentially expressed genes, the Gene Ontology (GO) enrichment analysis was carried out using the clusterProfiler package. This package is based on the Wallenius non-central hyper-geometric distribution. Furthermore, KEGG pathway enrichment analysis was conducted using the KOBAS software.
Candidate gene identification and haplotype analysis
Based on the results of Genome-Wide Association Studies (GWAS) and the analysis of differentially expressed genes, heat-tolerance candidate genes were identified. Subsequently, the functions of these candidate genes were explored in the National Rice Data Center (https://www.ricedata.cn/). Haplotype populations with less than 10 copies were deleted. The results of haplotype analysis were combined with the heat-tolerance gene set to further screen and analyze more valuable candidate genes for heat tolerance.
Real-time quantitative PCR validation
Total RNA was extracted from 24 samples (4 varieties, 2 treatments, 3 replicates) and reverse-transcribed into cDNA. Primers were designed based on the gene sequences in NCBI using qPrimerDB(qPrimerDB - a resource for real-time quantitative PCR primers). OsActin1(LOC_Os03g61970) was selected as the internal reference gene [8]. The relative expression level was calculated using the 2^−ΔΔCt^ method [33]. All primers utilized in this experiment were documented in Table S7.
Supplementary Information
Supplementary Material 1.
Supplementary Material 2.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Ashraf H, Ghouri F, Baloch FS, Nadeem MA, Fu XL, Shahid MQ. (2024) Hybrid Rice Production: A Worldwide Review of Floral Traits and Breeding Technology, with Special Emphasis on China. PLANTS-BASEL. 2024;13(5):578.10.3390/plants 13050578 PMC 1093528738475425 · doi ↗ · pubmed ↗
- 2Chatterjee Y, Bhowal B, Gupta KJ, Pareek A, Singla-Pareek SL. (2023) Lactate dehydrogenase superfamily in rice and Arabidopsis: Understanding the molecular evolution and structural diversity. Int J Mol Sci. 2023;24(6):5900.10.3390/ijms 24065900 PMC 1005747536982973 · doi ↗ · pubmed ↗
- 3Chen JY, Zhang HW, Zhang HL, Ying JZ, Ma LY, Zhuang JY. (2018) Natural variation at q Hd 1 affects heading date acceleration at high temperatures with pleiotropism for yield traits in rice. BMC Plant Biol. 2018;18:112.10.1186/s 12870-018-1330-5PMC 599282429879910 · doi ↗ · pubmed ↗
- 4Chen L, Wang Q, Tang MY, Zhang XL, Pan YH, Yang XH et al. (2021) QTL mapping and identification of candidate genes for heat tolerance at the flowering stage in rice. Front Genet. 2020;11:62187110.3389/fgene.2020.621871 PMC 786277433552136 · doi ↗ · pubmed ↗
- 5Gu XT, Si FY, Feng ZX, Li SJ, Liang D, Yang P et al. (2023) The Os SGS 3-tasi RNA-Os ARF 3 module orchestrates abiotic-biotic stress response trade-off in rice. Nat Commun. 2023;14:4441.10.1038/s 41467-023-40176-2PMC 1036617337488129 · doi ↗ · pubmed ↗
- 6Kilasi NL, Singh J, Vallejos CE, Ye CR, Jagadish SVK, Kusolwa P et al. Heat stress tolerance in rice Oryza sativa L.): identification of quantitative trait loci and candidate genes for seedling growth under heat stress. Front Plant Sci. 2018;9:157810.3389/fpls.2018.01578 PMC 622196830443261 · doi ↗ · pubmed ↗
- 7Li DD, Liu K, Zhao CC, Liang SY, Yang J, Peng ZI et al. (2023) GWAS combined with WGCNA of transcriptome and metabolome to excavate key candidate genes for rice anaerobic germination. Rice. 2023;16:49.10.1186/s 12284-023-00667-8PMC 1061815437907655 · doi ↗ · pubmed ↗
- 8Li PP, Jiang J, Zhang GG, Miao SY, Lu JB, Qian YK. et al. Integrating GWAS and transcriptomics to identify candidate genes conferring heat tolerance in rice. Front Plant Sci. 2023;13:1102938.10.3389/fpls.2022.1102938 PMC 986856236699845 · doi ↗ · pubmed ↗
