High‐Throughput Screen of NPQ in Sorghum Shows Highly Polygenic Architecture of Photoprotection
Richard L. Vath, Samuel B. Fernandes, Brandon Monier, Katarzyna Głowacka, Julia Walter, Alexander E. Lipka, John Ferguson, Carl J. Bernacchi, Taylor Pederson, Johannes Kromdijk

TL;DR
This study explores the genetic basis of photoprotection in sorghum, revealing a complex network of genes involved in this trait.
Contribution
The study identifies 110 candidate genes for photoprotection using a combination of GWAS and TWAS in a large sorghum panel.
Findings
Broad-sense heritability for chlorophyll fluorescence parameters ranged from 0.3 to 0.65.
A complex genetic architecture with many small-effect loci was uncovered for photoprotection traits.
110 unique candidate genes were identified as being associated with photoprotection in sorghum.
Abstract
Natural genetic variation in photosynthesis and photoprotection within crop germplasm represents an untapped resource for crop improvement. Sorghum bicolor (sorghum) is one of the world's most widely grown crops, yet the genetic basis of photoprotection in sorghum is not well understood. This study examined genetic variation in non‐photochemical quenching traits by screening a field‐grown panel of 861 genetically diverse natural sorghum accessions across 2 years. Broad‐sense heritability ranged between 0.3 and 0.65 across different chlorophyll fluorescence parameters. A combination of genome‐ and transcriptome‐wide (GWAS and TWAS) identification of genetic correlates with the observed trait variation uncovered a complex genetic architecture of many significant small‐effect loci. An ensemble approach based on GWAS and TWAS results and the covariance between different fluorescence…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
FIGURE 1
FIGURE 2
FIGURE 3
FIGURE 4
FIGURE 5
FIGURE 6
FIGURE 7
FIGURE 8| Trait | Description | Mean | Min | Max | |
|---|---|---|---|---|---|
| Max NPQ | Maximum NPQ level reached during fluorescence trace | 2.80 | 2.50 | 3.10 | 0.52 |
| NPQ ind. slope | Initial linear slope of NPQ induction () | 1.29 | 0.92 | 1.72 | 0.58 |
| NPQ rel. slope | Initial linear slope of NPQ relaxation in dark () | −2.68 | −3.20 | −2.19 | 0.67 |
| NPQ ind. | Exponential rate constant of NPQ induction () | 0.58 | 0.39 | 0.84 | 0.67 |
| NPQ rel. | Exponential rate constant of NPQ relaxation in dark () | 5.09 | 3.68 | 6.87 | 0.44 |
| NPQ at final dark time point | 0.89 | 0.69 | 1.12 | 0.47 | |
| rec. | Exponential rate constant of recovery in dark () | 5.08 | 3.97 | 6.48 | 0.43 |
| Photoprotection index | 0.82 | 0.76 | 0.85 | 0.36 | |
| Maximum quantum efficiency | 0.75 | 0.73 | 0.77 | 0.30 |
| Accession | Multi‐trait Score | Maximum NPQ | NPQ rel. | |
|---|---|---|---|---|
| 5.73 | 3.10 | 5.70 | 0.84 | |
| 4.74 | 3.00 | 5.20 | 0.85 | |
| 4.70 | 2.94 | 5.62 | 0.85 | |
| 4.59 | 2.88 | 5.92 | 0.85 | |
| 4.36 | 2.91 | 5.94 | 0.84 | |
| PI522129 | 4.32 | 3.01 | 6.03 | 0.82 |
| 4.30 | 2.92 | 6.33 | 0.83 | |
| 4.29 | 3.05 | 5.05 | 0.84 | |
| PI522160 | 4.26 | 3.03 | 5.59 | 0.83 |
| PI514465 | 4.18 | 3.02 | 6.11 | 0.82 |
| Analyses overlapped | Gene | Traits | Arabidopsis description |
|---|---|---|---|
| 11 | Sobic.001G255300 | ||
| 11 | Sobic.003G418000 | NPQ rel. intercept term | 2‐Oxoglutarate (2OG) and Fe(II)‐dependent oxygenase superfamily protein |
| 11 | Sobic.006G152000 | NPQ ind. | Reticulata‐like protein, putative (DUF3411) |
| 11 | Sobic.008G142400 | NPQ ind. slope | |
| 10 | Sobic.002G277900 | rec. amplitude | PLC‐like phosphodiesterases superfamily protein |
| 10 | Sobic.009G187300 |
rec. | Allyl alcohol dehydrogenase‐like protein |
| 10 | Sobic.010G093800 | NPQ ind. | |
| 10 & 9 | Sobic.004G161800 | NPQ ind. slope & NPQ ind. | |
| 9 | Sobic.001G185600 | MATE efflux family protein | |
| 9 | Sobic.001G269000 | NPQ ind. slope | Glycosyl hydrolase family 38 protein |
| 9 | Sobic.001G496900 | NPQ ind. amplitude | DNAse I‐like superfamily protein |
| 9 | Sobic.002G021900 |
rec. | Wall‐associated kinase‐like 6 |
| 9 | Sobic.004G250300 | NPQ rel. amplitude | Eukaryotic aspartyl protease family protein |
| 9 | Sobic.006G060100 | NPQ ind. | Glutaredoxin family protein |
| 9 | Sobic.009G054900 | NPQ rel. amplitude | Ca2+‐activated RelA/spot‐like protein |
| 9 | Sobic.009G187100 | NPQ rel. |
| Traits overlapped | Gene | Arabidopsis description |
|---|---|---|
| 14 | Sobic.005G154850 | NADP‐malic enzyme 4 |
| 13 | Sobic.001G189300 | AMP‐dependent synthetase and ligase family protein |
| 12 | Sobic.001G385400 | Guanyl‐nucleotide exchange factors; GTPase binding; GTP binding protein |
| 12 | Sobic.001G308100 | |
| 12 | Sobic.009G187900 | Basic helix–loop–helix (bHLH) DNA‐binding superfamily protein |
| 11 | Sobic.002G057400 | HAUS augmin‐like complex subunit |
| 11 | Sobic.005G200800 | |
| 11 | Sobic.004G150200 | Centrosomal protein |
| 11 | Sobic.001G494400 | Myosin‐binding protein (protein of unknown function, DUF593) |
| 11 | Sobic.003G287000 | |
| 11 | Sobic.001G139100 | |
| 11 | Sobic.006G188100 | |
| 11 | Sobic.002G007800 | Transcription factor jumonji (jmj) family protein/zinc finger (C5HC2 type) family protein |
| 11 | Sobic.004G172100 | Protein kinase superfamily protein |
- —Gatsby Foundation, Lecturer startup funding
- —Frank Smart studentship, Frank Smart studentship in Botany
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenetic Mapping and Diversity in Plants and Animals · Plant Gene Expression Analysis · Plant Molecular Biology Research
Introduction
1
Growth in global agricultural productivity is likely to be substantially diminished by projected increases in climatic variability (Lewis and King 2017; Pendergrass et al. 2017; Rahman et al. 2022; Rosenzweig et al. 2014), while demand for food, feed, and biofuel in 2050 is projected to require a 47% increase in agricultural productivity over 2011 levels (Sands et al. 2023). Improvement of crop photosynthetic efficiency has been suggested to offer a potential mitigating solution to help sustain productivity in a changing climate (Zhu et al. 2010; Long et al. 2025). Non‐photochemical quenching (NPQ), a leaf physiological process which dissipates excess absorbed light energy (photoprotection), is an important determinant of photosynthetic efficiency, and enhancements in the induction and relaxation rates of NPQ have been shown to increase yield and biomass in C_3_ broad‐leaved crops under field conditions (de Souza et al. 2022; Kromdijk et al. 2016). Natural genetic variation in photosynthetic and photoprotective traits promises great potential as a resource for crop improvement (Theeuwen et al. 2022), and variation in these traits has been shown to exist within crop and non‐crop germplasm (Ferguson, Caproni, et al. 2023; Kumari et al. 2025; Ortiz et al. 2017; Sahay et al. 2023, 2024; van Rooijen et al. 2017).
Sorghum bicolor ((L.) Moench; sorghum) is grown extensively throughout the world as a food and feed crop (Visarada and Aruna 2019) and offers high potential as a bioenergy stock (Erickson et al. 2012; Rodrigues Castro et al. 2015). Its high water use efficiency and resiliency to stressors (Hadebe et al. 2017; Maman et al. 2003) make sorghum an attractive option to potentially supplement or replace other crops in increasingly uncertain climate conditions. Sorghum's adaptability, relatively small diploid genome (~730 Mb), and extensive genetic diversity facilitate highly tractable variant studies to be conducted in diverse climates. These traits, combined with the availability of a high‐quality reference genome (McCormick et al. 2018), make sorghum an attractive resource for understanding genotype‐by‐environment (GxE) interactions in key C_4_ photosynthetic traits (Boyles et al. 2019), including NPQ. Uncovering the extent of genetic variability in photoprotective capacity and identifying genes underlying NPQ may help advance breeding efforts toward improvement of photosynthetic efficiency in sorghum.
An understanding of the extent of naturally occurring variation in photoprotection in sorghum, and identification of the genes underlying this variation could facilitate germplasm improvement via breeding, genomic editing, and transgenic approaches (Flood et al. 2011; Lawson et al. 2012; van Bezouw et al. 2019).
Genome‐ and transcriptome‐wide association studies (GWAS and TWAS, respectively) can be used to identify quantitative trait loci (QTL) and candidate genes that underlie highly polygenic photosynthetic traits (Gui et al. 2023; Tam et al. 2019; Tibbs Cortes et al. 2021) by correlating genomic marker and transcript expression variation with trait phenotype variation. Despite their demonstrated usefulness, both methods are complicated by the need to set an appropriate threshold for significance, which needs to maintain sufficient stringency to account for the multitude of parallel tests while keeping the rate of false negatives low. However, since the occurrence of false negatives and positives between both methods can be assumed independent, combining GWAS and TWAS offers a way to detect higher‐confidence QTL associated with complex crop physiological traits (Ferguson et al. 2021; Kremling et al. 2019; Lin et al. 2022; Pignon et al. 2021).
Molecular and reverse genetics‐based approaches have successfully identified genes involved in NPQ regulation in model species (Bru et al. 2020; Kasajima et al. 2011; Li et al. 2000), but larger‐scale screening of NPQ is a relatively recent pursuit. GWAS have been successful in identifying QTL associated with NPQ in Arabidopsis thaliana (Arabidopsis), Oryza sativa (rice), Glycine max (soybean), Zea mays (maize), and sorghum, both in controlled (Rungrat et al. 2019) and field (Ferguson, Caproni, et al. 2023; Herritt et al. 2016; Kumari et al. 2025; Sahay et al. 2023, 2024; Wang et al. 2017) conditions. Screening of NPQ kinetics on field‐grown plants can be cumbersome and challenging to accomplish in a manner which is high‐throughput enough for the large number of accessions required to discern small‐effect loci via GWAS and TWAS. As a result, our understanding of the genes behind natural variation in NPQ under production‐relevant conditions is still limited for several of the world's most important crops, particularly those with C_4_ photosynthetic metabolism, such as maize, Saccharum officinarum (sugarcane), and sorghum.
The current study utilizes a high‐throughput chlorophyll fluorescence screening method (Ferguson, Caproni, et al. 2023; Ferguson, Jithesh, et al. 2023; Gotarkar et al. 2022) to characterize and quantify rates of NPQ induction and relaxation in field‐grown plants of 861 biomass sorghum accessions across 2 years. Most of the measured traits showed moderately high broad‐sense heritability and their observed variation was underpinned by a complex architecture of many significant small‐effect loci. Combined GWAS and TWAS analyses uncovered 110 unique high‐confidence candidate genes, which may be used to improve understanding of NPQ regulatory regions in sorghum and other economically important, closely related C_4_ crops such as maize and sugarcane, and potentially in even more distantly related species (Sahay et al. 2023, 2024).
Materials and Methods
2
Germplasm and Field Trial Design
2.1
A sorghum panel composed of 869 genetically diverse biomass accessions (described previously in dos Santos et al. 2020; Valluru et al. 2019) was grown in 2017 and 2019 at the University of Illinois Maxwell Farm (2017; 40.055166, −88.236794) and Energy Farm (2019; 40.063333, −88.205000) Research Sites near Urbana, IL, USA. Accessions were planted in an augmented block design with 960 four‐row plots (3 m row length) arranged into 40 columns and 24 rows, in 16 blocks. The incomplete blocks were connected through six common check accessions to account for block effects during statistical analysis of phenotypic traits. This study utilized data from 855 and 846 accessions in 2017 and 2019, respectively, after filtering for availability of marker data. Together this allowed for a combined‐year analysis of 861 accessions. 839 accessions were common to both years, with three check accessions (Pacesetter, PI276801, and PI148089) common between years. The panel was planted on May 31 of both years, with five plots replanted on June 10, 2019, due to poor germination. Temperature, precipitation, and solar radiation data for both growing seasons are available in Figures S1 and S2.
Field Sampling
2.2
Sorghum accessions were screened for photoprotective traits in both years via chlorophyll fluorescence imaging. Sampling and screening were conducted using detached leaf segments (Ferguson, Jithesh, et al. 2023) via the 96‐well plate method detailed in Gotarkar et al. (2022) and Sahay et al. (2023). Plants were sampled by cutting leaf discs with a 6 mm diameter hole punch from the youngest fully expanded leaf, as indicated by ligule emergence at the time of measurement. Two samples from separate plants in the middle of both inner rows from each plot were taken, providing a total of four biological replicates per accession. To prevent sampling time differences between accessions, blocks were sampled sequentially by replicate. Hence, sampling time varied between replicates, but differences in average sampling time between genotypes were kept within 45 min (time taken to sample one block). Following collection, sample plates were wrapped in aluminum foil to prevent light intrusion and buffer temperature changes, then stored in a cooled polystyrene container while further sampling was completed. Four replicates of 128 accessions were sampled per day, between 14:30 and 18:30 local time. After all daily sampling was completed, plates were stored overnight at approximately 20°C in a temperature‐controlled lab. Sampling took place in 2017 from July 25 to 28 and August 1 to 4, and in 2019 from July 22 to 25 and July 28 to 31.
Chlorophyll Fluorescence Screening of Photoprotective Traits
2.3
Discs were imaged on the morning proceeding sampling using a fluorescence imager (CFImager, Technologica, Colchester, UK). The 96‐well plates were imaged in the order in which they were sampled the previous afternoon, to mitigate temporal bias. In a dimly lit room, foil was removed from each plate immediately before imaging, and the plate placed into the imager. The imaging routine utilized 800‐ms‐long flashes at 4000 μmol m^−2^ s^−1^ photosynthetic photon flux density (PPFD), and consisted of a dark‐adapted measurement of maximum photosystem II (PSII) quantum efficiency (Fv/Fm) followed by periodic fluorescence measurements over 10 min at 2000 μmol m^−2^ s^−1^ actinic PPFD followed by 12 min of darkness. During the NPQ induction (light) period, fluorescence parameters were measured every 20 s for the first minute, then every minute for the remainder of the light period. During the NPQ relaxation (dark) period, fluorescence parameters were measured every 20 s for the first minute, then every minute for the next 3 min, then every 3 min for the duration of the dark period (Figure 1). Image thresholding and segmentation were performed in MatLab (MATLAB (Version R2020a) 2020). Fluorescence values for each disc were taken as the median pixel value of the disc. The distribution of Fv/Fm values for the discs was examined to discern outlier leaf samples and discs with Fv/Fm values lower than 0.65 were excluded from further analyses.
Illustrative plot of NPQ induction during a high‐light treatment followed by NPQ relaxation and ΦPSII recovery during a subsequent period of darkness (styled after Kaiser et al. 2018).
Using MatLab, exponential models were fit to the NPQ traces of light induction and dark relaxation periods separately, to allow quantitative comparison of NPQ induction (Equation 1) and relaxation (Equation 2) rates, as well as the dark recovery of PSII operating efficiency (ΦPSII, Equation 3):
where aNPQi, aNPQr and aΦPSII are the amplitudes of the responses of NPQ induction, relaxation, and ΦPSII recovery during the dark period, respectively; kInd, kRel, and kRec are the rate constants of NPQ induction and relaxation and ΦPSII recovery, respectively (thus a larger number indicates faster kinetics), t is the measurement time point, and bNPQ or bΦPSII an offset to account for a non‐zero y‐intercept term. Modeled curve fits were visually examined for goodness‐of‐fit, with all traits of non‐conforming discs excluded from further analyses, resulting in the exclusion of 8% of discs in 2017 and 13% in 2019. Figure 1 provides a stylized depiction of fluorescence trace parameters.
In addition to the parameters derived from the exponential models, maximum NPQ reached during the light period and the initial linear slopes of NPQ induction/relaxation were also determined for each trace and the photoprotection index (PI), depicting the photoprotective effectiveness of NPQ, was calculated based on the method described by Ruban and Murchie (2012) and implemented as in Kromdijk et al. (2016). Briefly, PI is the ratio between observed Fv′/Fm′ (maximum PSII quantum efficiency at a given PPFD) and the calculated final Fv′/Fm′ based on the predictable decrease due to residual NPQ by the end of the 12‐min dark period. As in Ruban and Murchie (2012), PI≥ 1 suggests that the entire reduction in Fv′/Fm′ relative to initial Fv/Fm can be attributed to NPQ, while values progressively lower than one suggest a portion of the drop is attributable to photoinhibition (sustained depression of Fv′/Fm′f due to reaction center damage). PI was calculated as follows:
where Fv′/Fm′f and NPQf are Fv′/Fm′ and NPQ, respectively, at the final dark time point.
Statistical Modeling and Heritability
2.4
For each trait/year combination a residual maximum likelihood model was fit to each trait using ASReml for R (Butler et al. 2017):
where y is the vector of phenotypes; 1 (n × 1) is a vector of ones; μ is the trait mean; Z1 is the incidence matrix associated with the vector of random genotype (accession) effects g, with g∼N0σg2 where σg2 is the genetic variance; Z2 is the incidence matrix associated with the vector of random effect set s, with s∼N0σs2 where σs2 is the variance of set; Z3 is the incidence matrix associated with the vector of random effect block within set b, with b∼N0σb2 where σb2 is the variance of block within set; and e is the vector of residuals, with e∼N0σAR1xAR12 where σAR1xAR12 is the residual variance with a first‐order auto‐regressive structure applied to row and column for spatial correction.
Additionally, the 2 years' data were analyzed in a joint model:
where y is the vector of phenotypes for j environments; 1 (n × 1) is a vector of ones; μ is the trait mean; X is the incidence matrix associated with the vector of fixed effect environments t (j × 1); Z1 is the incidence matrix associated with the vector of random genotype effects within environment g, with g∼N0σg2 where σg2 is the genetic variance; Z2 is the incidence matrix associated with the vector of random effect of set within environment s, with s∼N0σs2 where σs2 is the variance of set within environment; Z3 is the incidence matrix associated with the vector of random effect block within set within environment b, with b∼N0σb2, where σb2 is the variance of block within set within environment; Z4 is the incidence matrix associated with the vector of random effect genotype interacting with environment gt, with b∼N0σgt2, where σgt2 is the variance of genotype with environment; and e is the vector of residuals, with e∼N0σAR1xAR12 where σAR1xAR12 is the residual variance with a first‐order auto‐regressive structure applied to row and column for spatial correction. The most appropriate variance–covariance structure to model the residuals was selected based on the Akaike information criterion. Outliers were filtered out based on method two of Bernal‐Vasquez et al. (2016).
Best linear unbiased predictions (BLUPs) were obtained separately for each genotype for 2017, 2019, and the combined model, resulting in data for 861 accessions to be used for genomic analysis. Generalized heritability (analogous to broad‐sense heritability) for each trait was calculated as follows:
where PEV is the prediction error variance (Cullis et al. 2006; Piepho and Möhring 2007) and σg is genetic variance.
Correlations between BLUPs were calculated and visualized with R packages psych (Revelle 2022) and ggcorrplot2 (Cai et al. 2022). A composite Multi‐trait Score was calculated for joint analysis BLUPs by summing standardized trait values for Maximum NPQ, NPQ relaxation k, and PI.
Genome‐Wide Association
2.5
The genotype data set utilized in this study was previously published in Ferguson et al. (2021). Briefly, 100,435 genotyping by sequencing single‐nucleotide polymorphisms (SNPs) available for 869 accessions (dos Santos et al. 2020) were imputed using a whole genome re‐sequencing panel with 5,512,653 SNPs and 229 accessions from Valluru et al. (2019). Imputation was done with Beagle 4.1 (Browning and Browning 2016) after filtering out SNPs with a minor allele count less than 20 and pruning SNPs in high linkage disequilibrium (LD) (r ^2^ > 0.9), using Plink (Purcell et al. 2007) with options “–indep‐pairwise 50 10 0.9”. The resulting data sets had 450,074 (2017), 450,449 (2019), and 454,087 (joint analysis) SNPs that were used for GWAS. The SNP dataset was used in TASSEL 5 (Bradbury et al. 2007) to obtain the kinship matrix and five principal components (PCs). In both cases, the default options were used. Univariate and multivariate GWAS were conducted in GEMMA (Zhou and Stephens 2012) using the Q + K model for each trait/year combination. The best number of principal components was decided based on the Bayesian information criterion.
Analysis of specific combinations of covarying traits might increase signal to noise for specific NPQ attributes of interest and potentially suggest genotypes which are particularly and uniquely responsive to photoprotection under dynamic light conditions. Thus, GWAS was performed on several sets of combined traits (CT). For multi‐trait GWAS, the following trait combinations were evaluated:
- CT1: Max NPQ, NPQ induction amplitude, NPQ induction k, NPQ relaxation k
- CT2: NPQ induction amplitude, NPQ induction k, NPQ relaxation k
- CT3: Max NPQ, NPQ induction k, NPQ relaxation k
- CT4: PI, ΦPSII recovery amplitude, ΦPSII recovery k
CT1 through CT3 are expected to amplify signal along the quickly versus slowly relaxing NPQ axis of variation, while CT4 should amplify variation along the photoprotection versus photodamage axis.
The unpruned dataset was used to calculate pairwise linkage LD, only including SNPs with an r ^2^ above 0.2, and using a sliding window maximum size of 500 kb and 99,999 SNPs with Plink options “–blocks no‐pheno‐req no‐small‐max‐span, –blocks‐max‐kb 500, –blocks‐min‐maf 0.001”. LD blocks were calculated as proposed in Gabriel et al. (2002). Linkage disequilibrium blocks and the number of SNPs and genes contained are available in Table S1. SNPs were considered to be in strong LD if the bottom 90% D‐prime confidence interval was greater than 0.70, and the top of the confidence interval was at least 0.98, for a total of 45,311 LD blocks (see Table S1) with a median size of 1.178 kb containing a median of 11 SNPs. Manhattan plots for visualization of SNP mapping were based on the “myManhattan” function (https://github.com/alfonsosaera/myManhattan) with modifications. QQ plots were produced via R CMplot (Yin et al. 2021).
Transcriptome‐Wide Association
2.6
Gene expression data (described in Ferguson et al. (2021)) from the controlled environment‐grown 229 sorghum accessions of Valluru et al. (2019) was used to analyze covariance of transcript abundance with photoprotection traits. Transcript abundance from tissue at the shoot growing point (GP) and base of the third leaf (3L) was analyzed separately for each photoprotection trait/year combination.
Before mapping, 10 hidden factors were calculated using probabilistic estimation of expression residual (PEER) factors for each tissue (Stegle et al. 2012). Five PCs were also calculated from prior genotype data. Finally, genes expressed in less than half of the individual lines were removed from each tissue set. After covariate calculation and filtering, a general linear model was fit individually for each NPQ trait and gene expression value using prior PEER factors and PCs as covariates. In addition to mapping each NPQ trait, a multitrait approach was also performed by combining several traits using methods described in the prior section. Mapping was conducted in the R environment (R Core Team 2022) using rTASSEL (Monier et al. 2022).
Combined Genome and Transcriptome‐Wide Analysis
2.7
Fisher's combined test (FCT) was performed using the GWAS and TWAS results to reduce the impact of false positives and negatives, based on the fact that the likelihood of true functional genetic variation increases dramatically when multiple types of analyses are performed and co‐analyzed (Kremling et al. 2019). For each trait/year combination, the nearest gene (by physical location) was assigned to the top 10% of GWAS SNPs by p‐value. FCT was then run on the combined top GWAS and all TWAS transcripts using the metap package for R (Dewey 2022).
Candidate Gene Selection and Investigation
2.8
Candidate genes (Table S2) were selected utilizing an ensemble approach (Kremling et al. 2019) which increases the statistical power available for determining genes associated with leaf physiological traits, which are often highly polygenic. Genes within LD blocks of the top 0.05% of GWAS SNPs (top GWAS SNPs available in Table S3) and the top 1% of genes from each TWAS (Table S4) and FCT analysis (Table S5), by p‐value, were classified as “top genes.” Fifteen total analyses were performed for each trait, comprising 2017, 2019, and joint models for GWAS and 2017, 2019, and joint models for both TWAS and FCT based on GP and 3L tissue.
Arabidopsis orthologues of sorghum genes overlapping in more than three separate analyses were identified using R package gProfiler2 (Kolberg et al. 2020). Panther (http://pantherdb.org/) statistical over‐representation analysis for biological function was performed on the Arabidopsis gene list against the Arabidopsis reference genome via rbioapi (Rezwani et al. 2022) (Table S6).
Top sorghum genes were considered candidates for photoprotection based on: (1) identification in eight or more individual analyses for a given trait and (2) identification as a top‐ranked gene in 10 or more separate traits (Figure 2; Table S7). Eight analyses and 10 traits were chosen as thresholds because the number of candidate genes overlapping multiple analyses or traits grew exponentially larger when fewer analyses or trait overlaps were considered. Additionally, several top sorghum genes were considered candidates based on manual investigation via direct searches for the sorghum gene ID in Google Scholar and based on their corresponding Arabidopsis orthologue being annotated in TAIR (Berardini et al. 2015) for light response and photoprotection‐related traits (such as xanthophyll synthesis or nonphotochemical quenching).
Workflow of sorghum candidate gene selection based on top 0.05% of GWAS and top 1% of TWAS and FCT genes overlapping multiple traits and analyses, and based on manual investigation.
Sorting intolerant from tolerant (SIFT) analysis (Ng and Henikoff 2003) was performed on candidate genes to investigate the likelihood of coding‐region SNPs' contribution to variation in protein function, utilizing SIFT scores from a sorghum panel containing 286 of the accessions used in this study (Lozano et al. 2021). Upset plots (Lex et al. 2014) were produced using the UpSetR package (Conway et al. 2017) to aid in visualization of top gene set overlaps for individual traits.
Results
3
Genetic Diversity in Sorghum Harbors Substantial Variation in Photoprotection Traits
3.1
A high‐throughput chlorophyll fluorescence screening method was employed to characterize photoprotective trait diversity in a field‐grown sorghum panel. NPQ induction and relaxation kinetics were measured in each sorghum accession via the application of a high‐light treatment followed by a dark period, allowing for a quantitative comparison of photoprotective capacity across a genetically diverse sorghum population. Substantial variation was observed among sorghum accessions in NPQ traits. The percentage difference between the lowest and highest accessions in maximum NPQ, NPQ induction rate constant, NPQ relaxation rate constant, and rate constant of ΦPSII recovery were 24%, 115%, 87%, and 63%, respectively, for joint model BLUPs of each trait (Table 1; Figure 3). BLUPs for all analyses and traits are available in Table S9.
Violin plots of variation in adjusted genotype means of NPQ trace parameters for the joint model. Internal box plot edges represent first and third quartiles. Points represent outliers beyond 1.5 times the interquartile range. The solid line within the boxes indicates the median. (a) Maximum NPQ level reached during the trace; (b) linear slope of NPQ induction; (c) linear slope of NPQ relaxation; (d) rate constant k of NPQ induction; (e) rate constant k of NPQ relaxation; (f) residual NPQ at end of dark relaxation period; (g) rate constant k of ΦPSII recovery; (h) photoprotection index; (i) Fv/Fm.
While the range of maximum NPQ remained similar in both 2017 and 2019, NPQ kinetic and light‐use efficiency traits varied considerably over the 2 years (Figure 4, Tables S10 and S11), potentially due to weather differences between both years (Figure S1) with total precipitation prior to sampling somewhat higher in 2019. NPQ induction in 2019 was slower across the panel, with a slope of 1.19 min^−1^ and k of 0.53 min^−1^, compared to a slope of 1.41 min^−1^ and k of 0.62 min^−1^ in 2017. NPQ relaxation was faster in 2019, with a median slope of −2.91 min^−1^ and k of 6.19 min^−1^, compared to a slope of −2.44 min^−1^ and k of 3.97 min^−1^ in 2017. The rate of ΦPSII recovery was also faster in 2019, with a median value of 5.66 min^−1^ compared to 4.47 min^−1^ in 2017. These differences may relate to inter‐year variation in susceptibility to photoinhibition during the light treatment, as the median PI in 2019 was 0.85 compared to 0.79 in 2017. Pairwise scatterplots of 2017 and 2019 BLUPs are shown in Figure 5. Accession values for all traits were weakly to moderately correlated between years with Pearson's correlation (r) values ranging between 0.21 and 0.45, but correlations were strongly significant in all cases, suggesting that accession ranks were generally maintained despite fairly strong genotype‐environment interaction effects.
Violin plots of variation in adjusted genotype means of NPQ trace parameters for 2017 and 2019 field seasons. Internal box plot edges represent first and third quartiles. Points represent outliers beyond 1.5 times the interquartile range. The solid line within the boxes indicates the median. (a) Maximum NPQ level reached during the trace; (b) linear slope of NPQ induction; (c) linear slope of NPQ relaxation; (d) rate constant k of NPQ induction; (e) rate constant k of NPQ relaxation; (f) residual NPQ at end of dark relaxation period; (g) rate constant k of ΦPSII recovery; (h) photoprotection index; (i) Fv/Fm.
Relationships of 2017 and 2019 BLUPs including Pearson's r and p‐value. Linear regression line shown in brown. The dashed black line is the 1:1 line.
Genotypes with favorable combinations of NPQ trait values could be useful in efforts to breed for improved NPQ and photosynthetic efficiency in sorghum. A composite Multi‐trait Score was computed for all joint analysis BLUPs based on trait values for maximum NPQ, NPQ relaxation time constant, and PI, where higher values for all traits are considered advantageous in a field growth environment. The top 10 genotypes by Multi‐trait Score are shown in Table 2 (values for all traits shown in Table S12, and position in trait space in Figure S26).
Heritability (H ^2^) of NPQ traits was moderate to moderately high (Figure 6), with joint model H ^2^ for maximum NPQ at 0.52, NPQ induction and relaxation slopes at 0.58 and 0.67, respectively, and NPQ induction and relaxation k at 0.67 and 0.44, respectively. Light use efficiency traits and PI were less heritable, with joint BLUP H ^2^ values of 0.43 and 0.30 for ΦPSII recovery k and Fv/Fm, respectively, and 0.36 for PI. Values of H ^2^ for 2017 and 2019 models were consistent across years for most traits, excepting PI and Fv/Fm which were notably higher in 2019 (0.60 for PI and 0.45 for Fv/Fm) than in 2017 (0.47 for PI and 0.37 for Fv/Fm).
Bar plots of heritabilities of sorghum photoprotective traits calculated from 2017, 2019, and joint analysis BLUPs. ΦPSII rec. k, exponential rate constant of ΦPSII recovery in dark; Fv/Fm, maximum PSII quantum efficiency; Max NPQ, maximum NPQ level reached during the trace; NPQ ind./rel. k, exponential rate constants of NPQ light induction and dark relaxation, respectively; NPQ ind./rel. slopes, initial linear slopes of NPQ induction and relaxation, respectively; NPQf, NPQ at final dark time point; PI, photoprotective index.
Correlations between traits were consistent between BLUPs estimated for each year (Figure 7b,c) and between joint model BLUPs (Figure 7a). Maximum NPQ tended to correlate with slower NPQ induction and faster relaxation kinetics: Significant (p < 0.05) correlations were observed between max NPQ and NPQ induction k (Pearson's r = −0.33, joint model), initial relaxation slope (r = −0.74; joint model), and relaxation k (r = 0.13; joint model). NPQ relaxation k was also strongly positively correlated with Φ PSII recovery k (r = 0.85; joint model), suggesting the short high light treatment did not cause substantial reaction center damage which might otherwise disrupt the expected strong relationship between Φ PSII and NPQ. As could be expected, initial slopes of NPQ induction and relaxation were significantly correlated with their corresponding rate constants k, at r = 0.88 for induction and r = −0.48 for relaxation (joint model). Intriguingly, accessions with a higher NPQ induction k tended to have a slower (less negative) initial relaxation slope (r = 0.25), both of which would enhance protection against photoinhibition. Accessions with lower PI (more photoinhibited) tended to exhibit a higher NPQf (r = −0.72), suggesting that the final NPQ level determined at the end of the imaging assay also reflects a degree of photoinhibition sustained during the light treatment.
Correlograms demonstrating Pearson's correlations between BLUPs of photoprotection traits of 839 sorghum accessions. The color and size of each square represent the correlation coefficient of the pairwise interaction. Squares marked with an asterisk denote Holm‐method corrected p‐values < 0.05. (a) Joint model. (b) 2017 model. (c) 2019 model. ΦPSII rec. k, exponential rate constant of ΦPSII recovery in dark; Fv/Fm, maximum PSII quantum efficiency; Max NPQ, maximum NPQ level reached during the trace; NPQ ind./ rel. k, exponential rate constants of NPQ light induction and dark relaxation, respectively; NPQ ind./rel. slopes, initial linear slopes of NPQ induction and relaxation, respectively; NPQf, NPQ at final dark time point; PI, photoprotective index.
Genome‐ and Transcriptome‐Wide Analyses Uncover Genes Associated With Photoprotection in Sorghum
3.2
Marker‐trait association (MTA) analyses were performed using the BLUPs from 2017, 2019, and combined year (joint) models to identify sorghum genomic regions associated with variation in photoprotective traits. For each model/year combination, MTA analyses were conducted using the sorghum SNP set (GWAS) and transcript expression data (TWAS). Genes in LD with the top 0.05% of GWAS SNPs (Table S3) and the top 1% of TWAS genes (Table S4), as ranked by Bonferroni‐adjusted p‐value, were brought forward as “top” genes. Additionally, genes in LD with the top 10% of GWAS SNPs and their corresponding transcript expression levels were analyzed for covariance with the NPQ traits via Fisher's Combined Test (FCT), with the top 1% of genes by Bonferroni‐adjusted p‐value brought forward as top FCT results (Table S5). Corresponding chromosome mapping plots for all MTA analyses are provided in Figures S3–S18, QQ plots and Z‐score histograms are shown in Figures S19–S22 and S23–S25, respectively.
An intersection‐based approach was used to produce a list of candidate genes that control sorghum photoprotection traits. As an example for a single trait, maximum NPQ may be reasonably representative of an accession's photoprotective capacity, particularly as it tended to correlate significantly with both NPQ induction and relaxation k. The joint model GWAS for maximum NPQ identified four SNPs with an FDR‐adjusted p‐value below 0.05, spanning a region of less than 20 kilobases within two LD blocks containing 74 genes (Figure 8a). When combined with TWAS results (Figure 8b,c) for the joint analysis, the FCT tests in both GP and 3L tissues (Figure 8d,e) strengthened this association with several genes in the same locus comprising some of the highest‐confidence genes in the analysis. This region was also highly enriched in the joint FCT of CT1, the multivariate combination of maximum NPQ, NPQ induction amplitude, and NPQ induction and relaxation k. A number of top genes overlapped in multiple joint‐model maximum NPQ analyses (Figure 8f), with five genes overlapping in three separate analyses and 93 genes overlapping in two separate analyses. Seventy genes, overlapping in eight or more model analyses for individual traits, were considered progressively higher confidence based on total number of overlaps. The 16 genes overlapping in nine or more analyses are summarized in Table 3.
Chromosome mapping (physical location) for SNPs and genes associated with maximum NPQ combined 2017/2019 adjusted means (a–e) and upset plot (f) showing the number of overlapping genes between the top 1% of hits in FCT 3L and GP, TWAS 3L and GP, and top 0.05% of hits in GWAS analysis, for maximum NPQ (joint model). (a) GWAS; (b) TWAS in GP tissue; (c) TWAS in 3L tissue; (d) Fisher's combined test from GP tissue; (e) Fisher's combined test from 3L tissue; Blue lines indicate the threshold of SNPs in top 0.05% (a) or genes in top 1% (a–e) of −log10 p values. SNPs in plot A with an FDR‐adjusted p‐value < 0.05 are highlighted in red. TWAS and FCT gene positions are plotted as midpoint of each gene.
Many top genes were common to multiple traits, with nearly 30% of top genes overlapping three or more trait top gene lists. The 37 genes which overlapped 10 or more traits were considered candidates; Table 4 summarizes the highest‐confidence portion of those. The multi‐model/single‐trait and multi‐trait overlap thresholds together resulted in 104 unique candidates (Table S2). An additional six unique genes were recorded as candidates based on overlapping associations with three or more traits and being either annotated in TAIR for light‐response‐related function or noted as carotenoid biosynthesis prior in Ortiz et al. (2017).
Several of the 104 candidates are orthologous to Arabidopsis genes which, based on their annotation, could contribute directly or indirectly to photosynthetic efficiency and photoprotective traits in dynamic light conditions. Sobic.006G152000 overlapped in 11 separate analyses for NPQ induction slope and k, eight analyses for CT3 (max NPQ, NPQ induction k, and NPQ relaxation k), and several analyses for the other combined traits. The Arabidopsis orthologue (LCD1) of this gene is involved in palisade mesophyll cell density (Barth and Conklin 2003). Sobic.003G418000, which overlapped 11 analyses for the NPQ relaxation intercept term, has been suggested to alter strigolactone biosynthesis in response to parasite stress in sorghum (Bellis et al. 2020), and is orthologous to Arabidopsis LBO1, involved in strigolactone biosynthesis (Brewer et al. 2016). Similarly, Sobic.006G060100 (overlaps nine analyses for NPQ induction k) is orthologous to Arabidopsis AT5G58530, a glutaredoxin (GRX)‐like protein.
Sobic.009G187300, with 10 overlaps for NPQ relaxation rate constant and the correlated ΦPSII recovery rate constant, is an orthologue of Arabidopsis AT3G59840, an allyl alcohol dehydrogenase‐like protein which has been implicated in regulation of ATP synthase activation/deactivation kinetics (Gong et al. 2006). Sobic.001G185600, overlapping nine analyses for NPQf, is orthologues to a MATE efflux family protein in Arabidopsis (AT1G11670) which has been associated with light‐induced regulation of gene expression during etiolation, potentially playing an early role in preparing Arabidopsis seedlings for photosynthesis (Hudson et al. 2003).
Of the multiple‐trait overlap candidates, Sobic.005G154850 was a top gene in 14 traits analyzed, predicted as an NADP‐ME isoform (not apparently involved in the C4 carbon concentrating mechanism) and orthologous to AT1G79750 (NADP‐ME4). Sobic.001G189300, orthologous to an AMP‐dependent synthase and ligase family protein, overlaps 13 top gene sets. This jasmonic acid precursor has been implicated in stress response in Arabidopsis (Bonsegna et al. 2005). Notably, Sobic.006G128300 overlapped as a top gene in seven separate analyses and is orthologous with the Arabidopsis NPQ6 gene, a YCF‐20 family gene involved in energy‐dependent quenching (Jung and Niyogi 2010). Additionally, Sobic.008G021000 was found to overlap four top gene analyses for both NPQ relaxation k and ΦPSII recovery k. This gene was also identified in a genetic screen for fluorescence parameters in sorghum (Ortiz et al. 2017) and annotated as PHYTOCHROME INTERACTING FACTOR 4 (PIF4), a transcriptional regulator of carotenoid biosynthesis (Ruiz‐Sola et al. 2014).
SIFT analyses of coding sequence SNPs within candidate genes were performed to uncover potential causal SNPs which may play a part in photoprotective regulation (Table S8). Of the 110 candidate genes, 99 contained at least one predicted nonsynonymous substitution (encoding a different amino acid than the reference allele), providing one possible causal mechanism underlying the observed marker‐trait associations.
Discussion
4
Fine‐tuning photoprotection to better match prevailing light conditions could contribute toward improvement in photosynthetic efficiency in the crops that feed and fuel the world (Zhu et al. 2010). Here, variation in NPQ in a large diversity panel was screened during two field seasons, confirming significant heritable variation in NPQ resides within the sorghum genome. Subsequent genome‐ and transcriptome‐wide analyses identified a number of candidate loci underlying genetic control of NPQ in one of the world's most stress‐resilient crops.
The Sorghum Genome Exhibits Wide‐Ranging and Heritable Capacity for NPQ
4.1
The variation in photoprotective traits observed within this sorghum population adds to a growing body of evidence that significant intraspecies variability in photoprotective capacity exists in C_4_ crops. BLUPs of maximum NPQ measured during the 10‐min high light treatment varied from just above 2.4 up to more than 3.2, a somewhat smaller range than that observed in field‐grown sorghum and maize panels subjected to similar light treatment conditions (Ferguson et al. 2025; Sahay et al. 2023, 2024) but similar to that found in a panel of 529 diverse rice accessions investigated by Wang et al. (2017), who reported mean NPQ values of 2.4–3.0 measured in the field after imposition of 5 min of 1000 μmol m^−2^ s^−1^ PPFD. Additional studies in rice diversity panels also noted genetic variation in NPQ expressed either as a quantum yield (Quero et al. 2021) or as NPQ_t_ (Wei et al. 2022), a simplified NPQ parameter that does not require dark‐adaptation. Herritt et al. (2016) reported differences in NPQ between accessions of a diverse soybean panel, derived via measurements of photochemical reflectance index, which is correlated with the energy‐dependent qE component of NPQ. NPQ determined from high‐throughput fluorescence imaging of Arabidopsis rosettes ranged from 1.5 to 4 (non‐averaged values) in contrasting accessions after 8 min at 1000 μmol m^−2^ s^−1^ PPFD (Rungrat et al. 2019). This is similar to the quality‐filtered unaveraged sorghum NPQ values in the current study which ranged from 1.0 to 4.3. While actinic light intensities and treatment timing differ between these studies, it is clear that variation in NPQ exists within single species, including economically important crops.
NPQ, like many photosynthetic traits, is developmentally and environmentally plastic (Bielczynski et al. 2017). Accordingly, the distribution of several trait values in this study shifted between 2017 and 2019 (Figure 4), though notably maximum NPQ was more correlated between both years than most of the other NPQ parameters. The faster NPQ induction and slower NPQ relaxation rates, and higher residual NPQf in 2017 compared to 2019 suggest a “more protected” state in 2017. Annual solar radiation differences may explain part of the difference (Figure S3; Illinois Climate Network 2025), as cumulative irradiation during the period from planting through sampling was 13% higher in 2017 than in 2019. The differences between years may also be explained in part by less frequent and lower total precipitation during the early part of the 2017 growing season (Figure S1). Though the difference in total precipitation from May through July was not particularly large between the 2 years (approximately 38.6 mm less in 2017 than 2019), precipitation in the months prior to planting was lower in 2017 than 2019. Annual cumulative precipitation was 19% and 9% below the 20‐year annual mean in 2017 and 2019, respectively (NOAA Online Weather Data 2025). In a similarly diverse multi‐year sorghum NPQ screen, Sahay et al. (2024) observed stronger correlations in NPQ kinetics between years than between low and high nitrogen treatments imposed within years, providing further evidence of the environmental plasticity of NPQ in the field. In the current study, the resulting reduction in water availability, combined with higher irradiation, may have reduced downstream photosynthetic capacity and “primed” photoprotective capacity, leading to upregulation of NPQ responses in 2017. Indeed, the depressed PI and Fv/Fm values measured in 2017 suggest a higher level of photoinhibition, despite the increased induction of sustained NPQ. An additional year's trial would aid in more thoroughly disentangling environmental effects from true genetic variation. Despite the environmental plasticity, the medium‐to‐high levels of heritability found for max NPQ and induction and relaxation rate constants in this and other studies, and significant correlations between different years, facilitated the use of GWAS and TWAS to determine underlying causal genes.
The strong positive correlation observed here between BLUPs for NPQ relaxation k and Φ PSII recovery k indicates that these two traits may be genetically and mechanistically linked; similar conclusions based on overlapping candidate genes for NPQ and PSII traits, associated with control of photosynthetic efficiency, support this linkage (Sahay et al. 2023). Faster NPQ relaxation rate constants were positively correlated with lower photoinhibition during the light treatment, as evident from the positive, significant correlation between PI and NPQ relaxation k in 2017. Arabidopsis mutants with faster NPQ relaxation kinetics during a similar length of treatment showed similar results (Li et al. 2002), though the particularly high maximum NPQ reached by PsbS overexpression mutants in that study may also have played a role in photoinhibition avoidance. In the current study, maximum NPQ reached was not correlated with PI, suggesting that during this short‐term high light treatment, NPQ induction kinetics were a larger determinant of potential photoinhibition than the actual level of NPQ attainable by a given accession.
Sorghum accessions displaying consistently favorable NPQ trait performance across environments may provide a useful resource for sorghum breeding toward improved photosynthetic efficiency. Despite a notable amount of variability in trait values between years, accession rank order tended to be maintained (Figure 5); as such, composite Multi‐trait Scores were calculated to provide a potential list of candidate genotypes that may be of interest for NPQ trait‐related breeding (Table 2). The top‐performing genotypes are fairly diverse within the genetic trait space (Figure S23), which suggests that accessions with desirable NPQ characteristics could be introduced into a breeding pool without great concern for genetic bottlenecking.
Genes Related to Stress Response and Photosynthetic Control Are Newly Associated With NPQ Traits
4.2
To increase confidence in identifying loci with genes that likely play a role in controlling photoprotection in sorghum, we employed an ensemble approach combining GWAS, TWAS, and FCT analyses as well as leveraging the covariance between fluorescence parameters. Genes which overlapped in eight or more analyses for a given trait, or were top genes for 11 or more traits, were considered candidate genes. Several of the candidates are orthologous to Arabidopsis genes annotated for light use or photosynthesis‐related processes. The Arabidopsis LCD1 gene, similar to Sobic.006G152000, is involved in palisade mesophyll cell density, and mutant phenotypes exhibit sensitivity to growth light conditions, though without an apparent response to short‐term high‐light stress (Barth and Conklin 2003). If the gene function is conserved in sorghum, modulation of cell density may affect light absorption and photosynthetic capacity, both of which could impact NPQ. Sobic.003G418000 is involved in strigolactone biosynthesis (Bellis et al. 2020). Strigolactones regulate plant growth responses to suboptimal conditions and share a biosynthesis pathway with carotenoids involved in photoprotection and antioxidant defense (Hirschberg 2001). Recently, Thula et al. (2022) also observed a direct role of strigolactones in influencing high light tolerance of photosynthesis in Arabidopsis. Sobic.006G060100 is orthologous to AT5G58530 in Arabidopsis, a GRX‐like protein which could be involved in abiotic stress responses via redox regulation and antioxidant capacity (Rouhier et al. 2008). Sahay et al. (2023) also uncovered multiple thioredoxin candidate genes associated with NPQ in maize. It is not surprising to find that redox metabolism and antioxidant scavenging are deeply intertwined with both short and longer‐term photoprotective processes (Foyer et al. 2012; Müller‐Moulé et al. 2002), with overlap in mechanisms as well as precursor molecules. Variability in dynamic photoprotective responses due to underlying variation in antioxidant scavenging may therefore be manifested here.
Interestingly, Sobic.008G142400, previously identified as a triterpenoid synthesis gene involved in cuticular wax formation (Busta et al. 2021), was a strong candidate gene in this study overlapping 11 separate analyses. While Sahay et al. (2024) observed only seven of the same candidate genes in their sorghum NPQ screen as found here, notably, Sobic.008G142400 was one of the common candidates. Correlations between cuticular wax and NPQ (and other photosynthetic traits) have been previously noted in other species but with rather inconsistent results (Li et al. 2023; Vijayaraghavareddy et al. 2022). This suggests that a mechanistic study between cuticular wax and photoprotection may be warranted to specifically investigate the degree to which photoprotective capacity is altered by cuticular wax coverage, or vice versa. As wax amount and composition have been shown to be heritable traits (Haque et al. 1992; Tassone et al. 2016), optimization of these traits for specific crop environments might provide another potential route toward improving photoprotective and photosynthetic efficiency.
Sobic.009G187300 is orthologous to AT3G59840, an allyl alcohol dehydrogenase‐like protein, which plays a role in activation and deactivation of ATP synthase (Gong et al. 2006). Variation in ATP synthase activation kinetics may have affected NPQ via effects on the chloroplast stroma/thylakoid lumen pH gradient, given the role of lumenal pH on PsbS and xanthophyll cycle‐dependent NPQ (X.‐P. Li et al. 2000). Sobic.001G269000, orthologue of Arabidopsis AT5G66150, overlapped nine analyses for NPQ induction slope. The gene is linked to N‐glycosylation modification in Arabidopsis (Strasser et al. 2006); though N‐glycosylation ubiquitously affects protein biogenesis and function in plants (Strasser 2022), mutants in N‐glycosylation have shown impairment in NPQ capacity and quantum efficiency (Jiao et al. 2020).
Of the genes overlapping a high number of traits, Sobic.005G154850 is an interesting candidate, orthologous to several NADP‐ME isoforms in Arabidopsis which have been implicated in response to diverse stressors and in developmental plasticity (Badia et al. 2015; Voll et al. 2012). One such orthologue is the fatty‐acid oxidation pathway NADP‐ME4 associated with photomorphogenesis in the ancestral C_3_ (Ma et al. 2002); variability in this enzyme could conceivably be responsible for developmental plasticity in light harvesting and photoprotective capacity. This locus was recently found in a diverse sorghum panel to be associated with growth rate and plant height (Panelo et al. 2024). Sobic.005G154850 is a truncated protein with a large mis‐sense section at the 3′ end, but it is unclear if this could have affected NADP‐ME activity directly involved in the C_4_ decarboxylation cycle. Sobic.001G189300 is orthologous to a jasmonic acid precursor and overlapped 13 top gene lists. Jasmonic acid (JA) and its precursors have a well‐defined role in antioxidant stress response (Munné‐Bosch 2005; Yamauchi and Matsushita 1979); within‐population variance in JA production and activity could quite reasonably underlie variability in the photoprotection traits measured here. Nearly 40 candidate genes spread across several chromosomes overlapped 10 or more individual NPQ and PSII kinetic traits, suggesting that variation in single loci can functionally affect multiple aspects of photoprotection and photochemistry, a pattern also observed in Sahay et al. (2024).
The previously discussed genes are mostly novel candidate genes associated with variation in sorghum photosynthetic efficiency in dynamic light conditions. While large‐scale quantitative genomic studies like this are not causal evidence for functional regulation, they shed light on genomic regions that have strong associations with the traits of interest, providing a valuable resource for future validation studies. In addition to the candidates listed above, several candidate genes were found to be orthologous to Arabidopsis genes previously associated with photoprotection. An orthologue to Arabidopsis chloroplast lipocalin LCNP (Sobic.006G222700) was identified in the overlap between 6 different analyses for three separate traits including NPQ relaxation slope and NPQf. The gene product from the Arabidopsis ortholog is involved in sustained NPQ (qH; Malnoë et al. 2017), and its association with residual NPQ after light treatment in this study suggests it may function similarly in sorghum. Sobic.006G128300 overlapped seven trait top gene lists and is orthologous to AT5G43050, a chloroplast‐localized YCF20 gene sharing 70% sequence similarity (Berardini et al. 2015). SIFT analysis found two nonsynonymous SNPs in the coding sequence of this gene in Sorghum, which is especially interesting since an Arabidopsis mutant carrying a single base‐pair deletion within this gene (mutant NPQ6) exhibits reduced NPQ (Jung and Niyogi 2010). SIFT analysis across the 110 candidate genes showed that SNPs within coding regions of 90% of the identified candidates were predicted to contain nonsynonymous substitutions, which provides a potential mechanism which could give rise to the observed phenotypic variation. A similarly high prevalence of nonsynonymous substitutions was found across all genes in a comparative study of deleterious mutations in sorghum and maize (Lozano et al. 2021); the high percentage of nonsynonymous substitutions in photoprotection‐associated genes may provide a potential mechanism for some of the heritable diversity in photoprotection within this diverse sorghum population. Given the wide‐ranging climatic origins of the panel (Ferguson et al. 2021), nonsynonymous coding region variation in candidate gene sequences may reflect environmental adaptation of NPQ characteristics. The complex and diverse genetic architecture underlying dynamic photoprotective traits, along with the measured trait heritability, suggests that informed genomic selection could affect meaningful changes in NPQ phenotypes in sorghum.
Conclusion
5
This work characterized the extent of heritable variation in photoprotective traits in sorghum using the largest field‐grown diversity panel for these parameters to date. The results highlight the complexity of genetic control of photoprotection in this species, with a wide range of significant small effect loci associated with the fluorescence parameters. Using an ensemble approach combining GWAS and TWAS and leveraging the covariance between fluorescence parameters, several novel high‐confidence candidates were identified which may exert genetic control of photosynthesis and photoprotection. This information can be used to inform further, targeted experiments via mutant or genetic modification studies to manipulate NPQ in sorghum, and potentially if combined with appropriate marker development could be used in marker or genomics‐assisted breeding for photosynthetic efficiency in sorghum and related C_4_ crops. For instance, modification of candidate genes such as Sobic.009G187300, involved in ATP synthase activation, may affect NPQ kinetics through changes in redox state. Or, this may represent a target for photosynthetic efficiency improvement through hastening of photosynthetic induction, which has been observed previously to be slower in C_4_ species (Arce Cubas et al. 2023). Modifications to ATP synthase kinetics could have complex effects on upstream electron transport and NPQ, but specific engineering of ATP synthase has the potential to confer flexibility in response to dynamic light conditions (Yamamoto et al. 2023). The genes Sobic.003G418000 (strigaloactone biosynthesis) and Sobic.006G060100 (GRX‐like protein encoding) may also offer potential targets given their role in antioxidant and redox regulation, particularly in plants already experiencing stressors such as drought or heat. Most of the identified candidate genes were not previously annotated with direct functions in photoprotection, but several of the candidates are associated with adjacent processes that are indirectly linked to photosynthesis and photoprotection. While germplasm‐focused enhancements in crop productivity will require a whole‐plant focus (Lopes et al. 2011), better understanding genetic control over photoprotective capacity in challenging stress conditions could play a role in development of crop germplasm better‐suited to anticipated hotter, drier conditions across many of the global crop growing regions.
Funding
This work was supported by funding from the Frank Smart Studentship in Botany to R.L.V. and Gatsby Foundation startup funding to J.K.
Conflicts of Interest
The authors declare no conflicts of interest.
Supporting information
Figure S1: Plots of maximum daily temperature and total daily precipitation recorded at the Willard Airport weather station (Savoy, IL, USA) during the 2017 and 2019 sorghum panel growing seasons. Figure S2: Plots of daily accumulated solar irradiation recorded at the Illinois State Water Research Center (Champaign, IL, USA) during the 2017 and 2019 sorghum panel growing seasons. Figures S3–S6: Chromosome mapping (Manhattan) plots for single‐nucleotide polymorphisms associated with NPQ and combined NPQ traits in 2017, 2019, and joint genome‐wide association study analyses. Figures S7–S12: Chromosome mapping (Manhattan) plots for genes associated with NPQ and combined NPQ traits in 2017, 2019, and joint transcriptome‐wide association study analyses in third leaf and growing point tissue. Figures S13–S18: Chromosome mapping (Manhattan) plots for genes associated with NPQ and combined NPQ traits in 2017, 2019, and joint Fisher's combined test analyses in third leaf and growing point tissue. Figures S19–S22: QQ plots of GWAS SNPs for 2017, 2019, and joint NPQ traits. Figures S23–S25: Signed Z‐score histograms of univariate genome‐wide association analyses, faceted by trait. Figure S26: Biplot of first two principle components of GWAS SNP set with top Multi‐trait Score accessions highlighted.
Table S1: Linkage disequilibrium (LD) blocks containing top 0.05% of SNPs from GWAS. Table S3: Summary of candidate genes. Table S4: Top 1% of genes identified via TWAS for each trait, tissue, and model year. Table S5: Top 1% of genes identified via Fisher's combined test for each trait and model year. Table S6: Gene ontology (GO) biological process enrichment analysis of Arabidopsis thaliana orthologues of the top sorghum genes which overlapped in three or more analyses. Table S7: Summary of genes in top results for FCT 3L, FCT GP, TWAS 3L, TWAS GP (top 1%), and GWAS (genes in LD with top 0.05% of SNPs), for each analysis. Table S8: SIFT scores of coding sequence single‐nucleotide polymorphisms SNPs within candidate genes. Table S9: BLUPs of all traits for each accession and model year. Table S10: 2017 NPQ trait values for individual leaf discs. Table S11: 2019 NPQ trait values for individual leaf discs. Table S12: NPQ trait values for top 10 sorghum accessions by Multi‐trait Score.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Arce Cubas, L. , R. L. Vath , E. L. Bernardo , C. R. G. Sales , A. C. Burnett , and J. Kromdijk . 2023. “Activation of CO 2 Assimilation During Photosynthetic Induction Is Slower in C 4 Than in C 3 Photosynthesis in Three Phylogenetically Controlled Experiments.” Frontiers in Plant Science 13: 1091115. 10.3389/fpls.2022.1091115.36684779 PMC 9848656 · doi ↗ · pubmed ↗
- 2Badia, M. B. , C. L. Arias , M. A. Tronconi , et al. 2015. “Enhanced Cytosolic NADP‐ME 2 Activity in A. thaliana Affects Plant Development, Stress Tolerance and Specific Diurnal and Nocturnal Cellular Processes.” Plant Science 240: 193–203. 10.1016/j.plantsci.2015.09.015.26475199 · doi ↗ · pubmed ↗
- 3Barth, C. , and P. L. Conklin . 2003. “The Lower Cell Density of Leaf Parenchyma in the Arabidopsis thaliana Mutant Lcd 1‐1 Is Associated With Increased Sensitivity to Ozone and Virulent Pseudomonas syringae .” Plant Journal 35, no. 2: 206–218. 10.1046/j.1365-313X.2003.01795.x.12848826 · doi ↗ · pubmed ↗
- 4Bellis, E. S. , E. A. Kelly , C. M. Lorts , et al. 2020. “Genomics of Sorghum Local Adaptation to a Parasitic Plant.” Proceedings of the National Academy of Sciences 117, no. 8: 4243–4251. 10.1073/pnas.1908707117.PMC 704915332047036 · doi ↗ · pubmed ↗
- 5Berardini, T. Z. , L. Reiser , D. Li , et al. 2015. “The Arabidopsis Information Resource: Making and Mining the ‘Gold Standard’ Annotated Reference Plant Genome.” Genesis 53, no. 8: 474–485. 10.1002/dvg.22877.26201819 PMC 4545719 · doi ↗ · pubmed ↗
- 6Bernal‐Vasquez, A.‐M. , H. F. Utz , and H.‐P. Piepho . 2016. “Outlier Detection Methods for Generalized Lattices: A Case Study on the Transition From ANOVA to REML.” Theoretical and Applied Genetics 129, no. 4: 787–804. 10.1007/s 00122-016-2666-6.26883044 · doi ↗ · pubmed ↗
- 7Bielczynski, L. W. , M. K. Łącki , I. Hoefnagels , A. Gambin , and R. Croce . 2017. “Leaf and Plant Age Affects Photosynthetic Performance and Photoprotective Capacity.” Plant Physiology 175, no. 4: 1634–1648. 10.1104/pp.17.00904.29018097 PMC 5717728 · doi ↗ · pubmed ↗
- 8Bonsegna, S. , S. P. Slocombe , L. Bellis , and A. Baker . 2005. “At LACS 7 Interacts With the TPR Domains of the PTS 1 Receptor PEX 5.” Archives of Biochemistry and Biophysics 443, no. 1: 74–81. 10.1016/j.abb.2005.09.003.16256065 · doi ↗ · pubmed ↗
