A comprehensive landscape of the Gossypium arboreum circRNAome under multiple abiotic stresses
Liangdan Fei, Yifan Zhou, Tao Huang, Kun Wang

TL;DR
This study maps circular RNAs in cotton under stress, revealing their role in plant resilience and highlighting chloroplast-derived circRNAs as key players.
Contribution
The study provides a high-resolution circRNA atlas in cotton and identifies psbA-derived circRNAs as potential targets for crop improvement.
Findings
Over 4,365 high-confidence circRNAs were identified in cotton under abiotic stress.
Chloroplast-derived circRNAs dominate, especially from the psbA gene, suggesting a specialized regulatory mechanism.
A circRNA-miRNA-mRNA regulatory network was constructed, centered on cotton-specific miRNAs.
Abstract
Circular RNAs (circRNAs) serve as key post-transcriptional regulators in plant stress adaptation. Here, we comprehensively characterize the circRNA landscape of cotton (Gossypium arboreum) under multiple abiotic stresses condition using RNase R-enhanced sequencing. Through a stringent KNIFE-based algorithm pipeline, we identified 4,365 high-confidence circRNAs. Mechanistically, circRNA biogenesis was associated with long flanking introns and exhibited complex patterns of alternative splicing, revealing conserved production propensities but dynamic splice sites across species. Nuclear circRNAs frequently exhibited expression patterns decoupled from their host genes, typically in a stress-specific manner. By integrating miRNA-seq data, we constructed a circRNA-miRNA-mRNA regulatory network and found it centered mainly on cotton-specific miRNAs. Notably, we discovered an extraordinary…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsCircular RNAs in diseases · Plant Molecular Biology Research · RNA Research and Splicing
Introduction
1
Cotton is a fundamental industrial raw material for the global textile industry and a strategic resource vital to China’s national economy. As the world’s premier natural fiber crop, it accounts for approximately 20% of global fiber production and supports a multi-billions dollar value chain (Textile Exchange, 2024; OECD/FAO, 2024). However, the developmental period from flowering to boll opening is protracted, rendering the crop highly susceptible to drought and high temperatures, which can lead to a significant reduction in both yield and fiber strength (Sarwar et al., 2018; Zafar et al., 2023). Xinjiang, China’s primary cotton growing region in China, faces unique climatic challenges. Despite historical “warm-wetting” observations, recent evidence identifies a critical wet-to-dry shift (Yao et al., 2021) and intensifying hydroclimatic instability driven by extreme evaporation (Wang et al., 2025), posing persistent threats to the region’s agricultural resilience (Xu et al., 2025). This escalating climatic risks has been identified as a primary driver of yield fluctuations, severely constraining both cotton yield and fiber quality (Li et al., 2020). These multifaceted abiotic stressors—including heat, cold, salt, and UV radiation—pose significant threats to sustainable cotton production (Patil et al., 2024). Therefore, a comprehensive understanding of the molecular mechanisms underlying cotton stress resistance is crucial for the genetic improvement of climate-resilient and high-yield cultivars capable of withstanding these complex environmental challenges.
The emergence of non-coding RNAs (ncRNAs) offers a novel perspective for dissecting the mechanisms underlying crop stress resistance. In G. arboreum, previous systematic identification and high-resolution annotation of mRNAs and long non-coding RNAs (lncRNAs) have unveiled the complex regulatory landscapes governing cotton development and environmental adaptation (Wang et al., 2019a; Zheng et al., 2020). These datasets provide a robust foundation for exploring deeper regulatory layers, such as circular RNAs (circRNAs). Characterized by their covalently closed continuous loops lacking 5’ caps and 3’ poly(A) tails, circRNAs represent a widespread and diverse class of eukaryotic ncRNAs with essential roles in cellular homeostasis (Chen, 2016). Beyond their structural stability, circRNAs play pivotal roles in diverse biological processes and gene expression regulation, suggesting they are functional components of the transcriptome rather than mere splicing by-products (Hansen et al., 2013; Jeck et al., 2013).
Over the past decade, plant circRNA research has expanded substantially. For instance, overexpression of CircR5g05160 enhances rice resistance to fungal pathogens, while grape-derived Vv-circATS1 improves cold tolerance in Arabidopsis thaliana (Fan et al., 2020; Gao et al., 2019). Beyond their roles in environmental adaptation, circRNAs can exert sophisticated regulatory effects directly on their host loci. A prominent recent study demonstrated that circRNAs derived from miR156d can promote rice heading by repressing the transcription elongation of pri-miR156d through the formation of R-loops (Su et al., 2025). These findings highlight that circRNAs modulate plant development and stress responses through diverse and evolutionarily conserved mechanisms. Nevertheless, a comprehensive landscape of the cotton circRNAome—including its expression and interaction patterns under diverse abiotic stresses—remains largely elusive. To bridge this gap, we employed G. arboreum to systematically dissect circRNA biogenesis, stress specificity expression, and the intricate circRNA–miRNA regulatory circuitry. By integrating RNase R-treated RNA-seq with KNIFE-based algorithm identification pipelines, we generated detailed circRNA expression atlases under cold, salt, heat, and UV-B treatments. This work provides novel molecular modules and fundamental insights for improving multi-stress tolerance in cotton.
Materials and methods
2
Plant growth and treatments
2.1
The plant material utilized in this research was the diploid AA-genome Gossypium arboreum cv. Shixiya-1 (SXY-1). These plants were cultivated in an automated greenhouse maintained under the following controlled conditions: 28 °C, a 16 h light/8 h dark photoperiod, and 65% relative humidity. Two-week-old cotton seedlings were subjected to four abiotic stress treatments (Makarevitch et al., 2015). Stress conditions included heat (50 °C, 5 h) and cold (5 °C, 15 h) to trigger thermal-responsive regulators (Ni et al., 2021; Li et al., 2024), salt (irrigated with a 500 mM NaCl solution) to simulate severe salinity-induced ionic and osmotic stress (Zhang et al., 2017), and UV-B (1.24 μmol m^−2^ s^−1^, 2 h) to investigate rapid defense signaling (Song et al., 2024).
RNA extraction and circRNA sequencing
2.2
High-quality total RNA was used for circRNA library construction. To enrich for circular transcripts, total RNA was digested with RNase R to remove linear RNA. The resulting circRNA was purified using Agencourt RNA Clean XP magnetic beads. The purified circRNA was then fragmented and used as a template for cDNA synthesis. The cDNA fragments underwent end repair, “A”-tailing, and adaptor ligation, followed by final PCR amplification. Library quality was assessed using an Agilent 2100 Bioanalyzer for size distribution and quantified via qPCR. Qualified libraries were subsequently sequenced using PE100 high-throughput sequencing on the Illumina HiSeq 2500 platform.
CircRNA identification
2.3
Following acquisition of the sequencing data, quality control was performed using fastp v0.20.1 (Chen et al., 2018b). Subsequently, the clean reads were aligned to the G. arboreum SXY-1 genome (CRI assembly) (Du et al., 2018) with IGIA annotations (Wang et al., 2019a) using STAR v2.7.2b (Dobin et al., 2012). To identify chloroplast-derived circRNAs, we utilized the G. arboreum chloroplast genome sequence (NC_016712.1) as the reference (Wu et al., 2018).
This alignment facilitated the identification of all linear and back-splicing reads. circRNAs were identified in each sample using the KNIFE v1.4 software (https://github.com/lindaszabo/KNIFE), which also assigned a reliability score to each candidate. Finally, after excluding artifacts originating from ultra-long distance circularization, the bona fide circRNAs were retained based on stringent score cutoffs. Specifically, circRNAs with a score greater than 0.1 were retained from our RNase R-treated samples. For the non-stress control group, we utilized previously published high-quality RNA-seq datasets (Wang et al., 2019a), and retained circRNAs with a score greater than 0.9 to ensure high confidence. This integrated approach allowed for a robust comparison between RNase R-enriched libraries and standard transcriptomic data.
CircRNA validation
2.4
The existence of candidate circRNAs was validated by PCR using divergent primers specifically designed to amplify the back-splice junctions (BSJs). To compensate for the low abundance of circRNAs, PCR was performed for 40 cycles using an increased amount of cDNA template. The amplified product of the expected size was purified and subjected to Sanger sequencing. The resulting sequence was aligned to the circRNA sequence to confirm the precise location and presence of the BSJ.
Quantification and differential expression analysis of circRNAs
2.4.1
To quantify expression levels, the alignment results for all samples were sorted using SAMtools v1.9 (Li et al., 2009). Subsequently, StringTie v2.0.6 (Pertea et al., 2015) was employed to calculate the expression levels for all genes and circRNAs. Specifically, expression values from RNase R-treated samples represented circRNA levels, while those from control samples represented linear transcript levels. Differential expression analysis between samples was performed using the edgeR v3.32.1 package in R v4.0.2 (Robinson et al., 2009).
miRNA-related analysis and target prediction
2.4.2
To investigate the expression profiles of miRNAs under the applied abiotic stresses, small RNA sequencing (sRNA-seq) was performed on all collected samples. Following quality control and adapter trimming, the resulting sequencing reads were aligned to the G. arboreum genome. The identification and annotation of known and novel miRNAs were conducted using the sRNAanno annotation pipeline sRNAminer v1.0.0 (Chen et al., 2021). Based on these annotations, the expression levels of each miRNA in every sample were calculated. Potential miRNA-circRNA and miRNA-gene interactions were predicted using three independent computational tools: miRanda v3.3a (Enright et al., 2003), the TargetFinder software (Fahlgren and Carrington, 2010), and the psRNATarget online tool (Dai and Zhao, 2011). Only those interactions consistently identified and predicted by all three software platforms and met the established thermodynamic criteria were retained as highly reliable results for the construction of the circRNA-miRNA-mRNA regulatory network and downstream functional analysis.
Results
3
Identification of stress-responsive circRNAs in G. arboreum
3.1
We performed RNase R-treated RNA sequencing (circRNA-seq) on two-week-old G. arboreum seedlings subjected to four distinct abiotic stresses to profile their circular RNAs (circRNAs) (Supplementary Table 1). First, the sequencing quality was evaluated by analyzing the genomic distribution of clean reads, with the majority of clean reads aligning to genic regions (Supplementary Figure 1A). Furthermore, high correlation coefficients between biological replicates across all treatments, confirming the high reproducibility of our data and their suitability for reliable circRNA identification (Supplementary Figure 1B).
CircRNAs were predicted using the KNIFE pipeline, which assigned a reliability score to each candidate. The score distribution demonstrated that RNase R treatment effectively reduced low-scoring candidates and enriched those with high scores, thereby improving overall prediction fidelity (Figure 1A). Consistent with this, linear RNA abundance was markedly reduced post-digestion, validating the enzymatic enrichment (Figure 1B). However, false positives at BSJs remain an inherent challenge, as evidenced by the minimal overlap between digestion and non-digestion sets (Supplementary Figure 1C). To address this, we implemented a stringent filtering strategy to discriminate potential artifacts from bona fide circRNAs based on whether the BSJs are intragenic or intergenic. Analysis of the reliability score distribution showed that candidates with scores between 0–0.1 in RNase R-treated data, and 0–0.9 in control data, were predominantly supported by intergenic BSJs, indicating low credibility. Therefore, we applied dual thresholds: circRNAs with scores > 0.1 were retained from RNase R-treated samples, and those with scores > 0.9 from control samples. By integrating the results from both filtering pipelines, we identified 4,365 high-confidence unique circRNAs for downstream analysis (Supplementary Table 2). Notably, 2,591 of these candidates were supported by a single back-splice junction (BSJ) read in a single sample, indicating that the cotton circRNA landscape remains complex and has not yet reached a saturation plateau (Supplementary Table 3).
Identification of circRNAs in G. arboreum. (A) The numbers of detected distinct back-splicing events and their corresponding authenticity scores across different stress conditions. (B) Distribution of circular-to-linear ratios at BSJ sites detected in RNase R-treated samples versus controls. Statistical significance was determined using Fisher’s exact test. (C) Chromosomal distribution of identified circRNAs and their host genes across the 13 chromosomes. (D) Venn diagram illustrating the overlap of circRNAs identified across different conditions. (E) Read coverage and gene models for two representative circRNAs, Circ-Ga01g181 and Circ-Ga01g182. The scales were normalized by RPKM. The symbol *** indicates a statistically significant difference at the level of P < 0.001.
We further analyzed the chromosomal distribution of the identified circRNAs and their host genes. The majority were distributed across chromosomes 1 to 13, with only 175 mapped to unanchored scaffolds, supporting the notion that circRNA biogenesis is a genome-wide phenomenon in cotton (Figure 1C). Next, we examined their expression patterns under stress. Profiling revealed highly condition-specific expression: 1380, 423, 537, and 663 circRNAs were uniquely detected under cold, heat, salt, and UV-B conditions, respectively. By contrast, only 90 circRNAs were common to all four treatments (Figure 1D). Finally, to visually confirm the efficacy of our identification, we examined the read coverage at a representative locus (Figure 1E). In control samples, reads spanned the entire gene body, indicative of full-length linear transcripts. After RNase R treatment, these background reads were markedly depleted, whereas signals at the BSJs remained robust. This selective enrichment demonstrates that RNase R effectively removes linear RNA background, thereby ensuring the high fidelity of our circRNA identification.
Experimental validation of predicted circRNAs
3.2
To experimentally validate the identified circRNAs, we selected 33 high-confidence candidates based on their variable back-splicing read counts (Supplementary Table 3). Divergent primers targeting the BSJs were designed for each candidate. All candidates were successfully amplified by PCR, and subsequent Sanger sequencing unequivocally verified that the sequences matched the predicted junctions (Supplementary Figure 2).
This validation strategy successfully confirmed circRNAs across a spectrum of structural features and expression levels. For instance, Circ-Ga13g070, a single-exonic circRNA with high read support, was validated (Figure 2A). We also confirmed multi-exonic circRNAs, including Circ-Ga13g292 and Circ-Ga01g228, which represented high and very low read count candidates, respectively (Figures 2B, C). This comprehensive experimental confirmation underscores the sensitivity and reliability of our circRNA sequencing and annotation pipeline, affirming the genuine biological existence of these novel circRNAs in G. arboreum.
Experimental validation of identified circRNAs using RT-PCR and Sanger sequencing. (A–C) Genomic structures and PCR validation of three representative circRNAs: the single-exonic Circ-Ga13g070 (A), the multi-exonic Circ-Ga13g292 (B), and the multi-exonic Circ-Ga01g228 (C).
Characterization of circRNAs in G. arboreum
3.3
Genomic features and biogenesis propensities of cotton circRNAs
3.3.1
To systematically explore the molecular characteristics of cotton circRNAs, we began by analyzing their structures and lengths. The identified circRNAs were classified into three types: multi-exonic, single-exonic, and intronic formed by head-to-tail splicing of multiple exons, single exon and intron, respectively. Multi-exonic circRNAs were overwhelmingly dominant (3,151), followed by intronic (941) and single-exonic (273). Overall, circRNAs lengths peaked around 250 bp, with multi-exonic circRNAs being significantly longer than the other two types (Figure 3A).
Genomic features and evolutionary conservation of cotton circRNAs. (A) Violin plot showing the length distribution of circRNAs classified by their genomic origin. (B) Comparison of intron lengths among circRNA-flanking introns, circRNAs introns, and genome-wide introns. (C) Boxplot comparing the number of exons in circRNA-host genes versus non-host genes. (D) Bar chart showing the proportion of genes producing circRNAs, categorized by their total exon number. (E) Boxplot comparing the number of isoforms in circRNA-host genes versus non-host genes. (F) Bar chart showing the proportion of genes producing circRNAs, categorized by their isoform count. In (A–F), statistical significance was determined using Fisher’s exact test. (G) Venn diagram showing the overlap of circRNA host gene orthologs between G. arboreum and A. thaliana, O. sativa, and Z. mays. Significance of the overlap was measured by Hypergeometric test. (H) Schematic representation of conserved circularization events and splice sites in orthologous genes. The symbol *** indicates a statistically significant difference at the level of P < 0.001.
Next, we compared the lengths of introns flanking circRNA-forming exons (flanking introns), introns contained within circRNAs (circRNA-introns), and average genomic introns. Both circRNA-associated intron categories were significantly longer than genomic introns, strongly supporting the notion that longer introns facilitate circRNAs biogenesis (Figure 3B).
We further compared the genomic features of circRNA-host genes versus non-host genes. CircRNA-host genes possessed significantly more exons (average 12) than non-host genes (average 4), and exon number was positively correlated with circRNA production propensity (Figures 3C, D). Also, circRNA-host genes produced more mRNA isoforms (average 3) than non-host genes (average 1) (Figures 3E, F).
In summary, circRNA formation in cotton is non-random and preferentially occurs in genes with longer introns and more exons—features intrinsically linked to a higher capacity for complex alternative splicing.
Evolutionary Conservation and the Mechanism of Ring Formation
3.3.2
Although canonical models posit that circRNA biogenesis is driven by inverted complementary sequences (ICSs) or repetitive sequences in flanking introns, which promote proximity via base-pairing, our analysis in cotton revealed a distinct pattern (Xiao et al., 2020). We found no significant difference in the length or complementarity of ICSs flanking circRNAs compared to those flanking randomly selected linear exon (Supplementary Figure 3A). Moreover, repetitive sequence abundance was even lower in circRNA-flanking introns than in random intronic sequences (Supplementary Figures 3B, C). These results revealed that cotton circRNAs biogenesis was largely independent of these canonical cis-regulatory features, suggesting a greater potential reliance on other mechanisms.
To evaluate evolutionary conservation, we performed a comparative analysis using circRNA-seq data from Arabidopsis thaliana (Ye et al., 2015), Oryza sativa (Wang et al., 2019b), and Zea mays (Chen et al., 2018a). The significant overlap of circRNA-producing orthologs confirms the conservation of circRNA biogenesis potential at the gene level (Figure 3G). We then investigated whether the precise BSJs themselves were conserved. By comparing circRNAs from orthologous genes, we classified them into three categories based on splice site conservation: “Accordant”, “Half-accordant”, and “Unaccordant”. The “Unaccordant” category predominated in all cross-species comparisons, indicating that specific circularization sites are highly dynamic. (Supplementary Figure 3D). Despite this widespread divergence, we identified several instances of positional conservation. This is exemplified by the GOX1 and RH8 orthologs, which produce circRNAs from identical genomic coordinates across the compared species (Figure 3H). Thus, while the propensity of a gene to produce circRNAs is conserved, the specific circularization sites are highly dynamic across species.
Analysis of circRNA and host gene expression under abiotic stress
3.4
To understand how abiotic stress reshapes the circRNA landscape in G. arboreum, we compared the expression profiles of all identified circRNAs and their host genes across all samples. Heatmap visualization revealed distinct circRNA expression patterns, and differential expression analysis further quantified these stress-specific responses (Figures 4A, B). Notably, cold and UV-B stress provoked the strongest responses, characterized by a predominant upregulation (306 and 290 DE circRNAs, respectively). Salt stress, however, triggered a net downregulation (141 DE circRNAs), while heat stress elicited the weakest response (58 DE circRNAs). These results demonstrate that distinct abiotic stressors modulate the circRNA transcriptome in specific ways.
Expression patterns and correlation analysis of circRNAs and their host genes under abiotic stresses. (A) Hierarchical clustering heatmap showing the expression profiles of detected circRNAs across five conditions. (B) Heatmap displaying the expression levels of the corresponding host genes across the same samples. (C) Density plot illustrating the distribution of Pearson correlation coefficients calculated between the expression patterns of circRNAs and their cognate host genes. (D) Scatter plot comparing the expression fold changes (FC) of circRNAs and their host genes between control and conditions. The marginal density plots show the distribution of fold changes for each axis.
Beyond stress-specific expression, a key finding was the frequent decoupling of circRNA abundance from host gene transcription. Although a subset of circRNAs showed high expression correlation with their host genes (peak ~0.8), the overall median Pearson correlation coefficient was low (0.34) (Figure 4C). This regulatory independence was further underscored by scatter plot, which revealed that many circRNAs were differentially expressed even when their host gene transcripts remained stable (Figure 4D). Collectively, these data demonstrate that the accumulation of a substantial fraction of cotton circRNAs is governed by independent stress-specific regulatory mechanisms, rather than being passive byproducts of host gene transcription.
The potential regulatory role of circRNAs via the miRNA sponge mechanism in G. arboreum
3.5
CircRNAs have been widely reported to act as competitive endogenous RNAs, or miRNA sponges, sequestering miRNAs to modulate their target mRNAs (Hansen et al., 2013), which plays a pivotal role in fine-tuning plant stress responses (Yin et al., 2024). We therefore investigated this potential in G. arboreum. Bioinformatic predictions identified 656 circRNAs with potential miRNA-binding sites and 6,939 miRNA target mRNAs (Supplementary Tables 4–5). Then we performed small RNA sequencing (sRNA-seq) on samples subjected to cold, heat, salt, and UV-B stress. Integrated analysis with these data revealed a robust network: key miRNAs each targeted multiple circRNAs (Figure 5A) and numerous differentially expressed (DE) genes (Figure 5B).
The circRNA-miRNA-mRNA network analysis. (A) Number of predicted target circRNAs for the top 10 miRNAs under each abiotic stress condition. (B) Number of predicted target differentially expressed genes for the top 10 miRNAs under each abiotic stress condition. (C) Expression patterns of key hub miRNAs. The heatmap shows the fold change in reads per million (RPM) of selected miRNAs under stress versus the control. The adjacent bar plot displays their basal expression levels in the CK condition. (D) Classification of DE circRNAs based on their expression patterns with their host genes. “Reversed pattern” indicates circRNAs acting as “mRNA up/circRNA down” or vice versa, while “Coordinated pattern” suggests potential co-expression. (E) A circRNA-miRNA-mRNA interaction network centered on miR8633 under cold stress. Lines represent predicted interactions. Node color and intensity represent the expression fold change of the corresponding RNA compared to CK.
Notably, the majority of these key hub miRNAs are cotton-specific, such as miR8633, miR8646, and miR8634, and strongly stress-responsive (Figure 5C), implying their evolved specializations in lineage-specific stress adaptation. Classifying DE circRNAs by their expression relative to host genes revealed a notable prevalence of “reversed” over “coordinated” patterns (Figure 5D), supporting that unlike canonical cleavage-based mRNA-miRNA interaction, these circRNAs may engage with miRNAs primarily through other modes. As a representative case, the miR8633 network under cold stress showed upregulation of all targeting circRNAs but downregulation of most targeted mRNAs (Figure 5E). This pattern is consistent with a model where circRNAs sponge miR8633, thereby alleviating its repression of target genes such as Ga11g02685—a putative homolog of PRR5, a negative regulator of cold tolerance in Arabidopsis (Guan et al., 2013). Similar miR8633-centered patterns under other stresses suggest this may be a recurrent regulatory module in Gossypium stress adaptation.
High abundance and locus specificity of chloroplast circRNAs in cotton
3.6
Current research on plant circRNAs has primarily focused on those derived from nuclear genes via canonical back-splicing (Chen, 2020; Xiao et al., 2020). In contrast, our analysis in cotton revealed a remarkable enrichment of chloroplast-derived circRNAs, which constituted over 80% of the total circRNA pool (Figures 6A, B). This disproportionate abundance establishes the chloroplast as a hotspot of circRNA biogenesis and accumulation in cotton.
Characterization of chloroplast-derived circRNAs. (A) Comparison of the percentage of reads mapping to the chloroplast genome in circRNA-seq and RNA-seq libraries. (B) Pie charts illustrating the genomic distribution of sequenced reads. The left chart shows the proportion of reads mapping to the chloroplast genome versus the nuclear/mitochondrial genomes in the RNase R-treated sample, compared to the untreated control sample. (C) Genomic visualization of read coverage and BSJs at the chloroplast psbA locus. Arrows indicate the BSJ reads, verifying the circular structure at this locus.
Mapping circRNA-seq reads across the chloroplast genome revealed a striking pattern: they were not randomly distributed but exclusively and specifically concentrated at the 3’ terminus of the core photosynthetic gene psbA (Supplementary Table 6). Furthermore, we detected multiple back-splicing isoforms from this locus, indicating complex alternative circularization (Figure 6C).
The psbA gene encodes the D1 protein, an essential component of the Photosystem II (PSII) reaction center. Due to its central role and susceptibility to photodamage, psbA expression is highly regulated (Nixon et al., 2010). The exceptional abundance and precise localization circRNAs at its 3’ terminus suggest a potential role in PSII homeostasis. We hypothesize that these chloroplast circRNAs may participate in the regulation of PSII repair and turnover, potentially by modulating psbA mRNA metabolism. This discovery unveils a novel, circRNA-associated layer of potential post-transcriptional regulation within the cotton chloroplast.
Discussion
4
Establishment of a high-confidence CircRNA landscape in G. arboreum
4.1
In this study, we established a high-resolution and high-confidence circRNA atlas for G. arboreum under multiple abiotic stresses. A significant challenge in plant circRNA research has been the high false-positive rate due to low abundance and interference from linear RNA degradation. We addressed this by employing an RNase R-enhanced RNA-seq strategy combined with a KNIFE-based algorithm pipeline. Our results demonstrate that RNase R treatment effectively depleted linear transcripts while enriching BSJs, significantly increasing the “Circ Fraction” across all stress treatments. This approach not only increased the number of identified circRNAs but also ensured high fidelity, as evidenced by our successful PCR and Sanger sequencing validation of 33 candidates. This comprehensive dataset significantly expands the known repertoire and understanding of circRNA in cotton.
Unique biogenesis mechanisms and evolutionary dynamics
4.2
The biogenesis of circRNAs in G. arboreum appears to diverge from the classical animal model. In animals, circRNA formation is primarily driven by Inverted Complementary Sequences (ICSs) in flanking introns (Zhang et al., 2014). However, our analysis revealed no significant enrichment of ICSs in cotton circRNAs. Instead, consistent with findings in other plants like Oryza sativa and Arabidopsis thaliana (Ye et al., 2015), we found that circRNA production is highly correlated with longer flanking introns, suggesting that plant-specific splicing factors or chromatin states might play a more dominant role. While the majority of identified circRNAs are derived from exonic back-splicing, the discovery of a significant portion of intronic circRNAs under abiotic stress suggests that stress-induced alternative splicing may trigger the circularization of non-canonical genomic regions, potentially serving as a rapid regulatory response to environmental fluctuations.
Our analysis revealed a pronounced decoupling between circRNA abundance and host gene expression, evidenced by a low median Pearson correlation coefficient of 0.34. This supports the “regulatory independence” model, suggesting that for the majority of the cotton circRNAome, accumulation is governed by autonomous biogenesis or stability control mechanisms rather than being a mere byproduct of host gene transcription. Such independence is an evolutionarily conserved strategy, aligning with large-scale transcriptomic observations in other major crops like rice and maize (Fan et al., 2025; Tang et al., 2018). Beyond this decoupling, we specifically identified “reversed patterns” where certain circRNAs are negatively correlated with the expression of their host genes. Such negative correlations (termed “reversed patterns” in our study) are common in both animals and plants, including poplar hybrids and maize (Abdelmohsen et al., 2017; Zhang et al., 2023; 2019). This aligns with the “competitive splicing” model, where circRNA biogenesis limits linear mRNA production by competing for splicing machinery or regulatory proteins (Wang et al., 2022). These findings reinforce the notion that circRNAs have evolved into independent functional entities, providing an additional layer of post-transcriptional control that contributes to the environmental resilience of cotton.
The dominance of organelle-derived circRNAs and psbA regulation
4.3
A notable finding of this study is the high abundance of chloroplast-derived circRNAs, which accounted for over 80% of total circRNA reads following RNase R treatment. Unlike their nuclear counterparts, these organellar molecules are specifically clustered at the 3’ terminus of the psbA gene. While nuclear circRNAs in cotton are associated with long flanking introns, chloroplast circRNAs likely stem from the high transcriptional flux of psbA and its unique 3’ terminus processing. Furthermore, the lack of 5’ caps and 3’ poly(A) tails in the chloroplast may inherently favor the formation and structural stability of covalently closed loops, protecting them from exonuclease-mediated degradation. As the structural and functional core of the Photosystem II (PSII) reaction center, psbA is indispensable for fine-tuning primary charge separation and ensuring photosynthetic resilience under fluctuating environmental conditions (Bhattacharjee et al., 2025). Given that its encoded D1 protein is highly susceptible to photoinhibition and requires rapid turnover (Aro et al., 1993), the striking enrichment of circRNAs at this pivotal locus suggests a specialized layer of post-transcriptional regulation. As psbA encodes the D1 protein, these circRNAs may serve as a conserved regulatory buffer to fine-tune photosynthetic machinery under stress. While the field has largely focused on nuclear-derived circRNAs (Li et al., 2018; Zhang et al., 2020), our results highlight the chloroplast—and specifically the psbA locus—as a critical yet underexplored hotspot for circRNA-mediated stress responses in cotton. The dominance of psbA-derived circRNAs in cotton likely arises from a synergy between high template availability and site-specific biogenesis. The preferential enrichment of 3’ terminus-derived reads following RNase R treatment (Figure 6C) supports an active circularization model, consistent with findings in Arabidopsis and other dicots where organelle circRNAs act as dynamic regulatory modules (Liu et al., 2019; Philips et al., 2020).
circRNAs in cross-stress tolerance
4.4
Our results demonstrate that a subset of circRNAs responds commonly to drought, cold, heat, and UV-B, suggesting their role in “cross-talk” pathways. One plausible mechanism, similar to that identified in cotton CMS-D2 restorer lines, is that they act as competitive endogenous RNAs (Wang et al., 2024). For instance, the enrichment of circRNAs originating from key stress-related loci identified in this study may allow them to modulate the plant’s core adaptive machinery by regulating their host transcripts or interacting with downstream signaling components. This aligns with the “rising star” role of plant circRNAs in fine-tuning cellular homeostasis under environmental pressure (Zhang and Dai, 2022).
Although their covalently closed loops generally provide high structural stability against exonucleases, their steady-state abundance represents a dynamic equilibrium between biogenesis and clearance (Chen, 2020; Xiao et al., 2020). In our study, the frequent decoupling of circRNA abundance from host gene levels (median r = 0.34) suggests that this equilibrium is independently and tightly controlled under abiotic stress. Even if some circRNAs undergo relatively rapid turnover, they may function as transient molecular switches or decoys to facilitate swift physiological adjustments to environmental fluctuations.
Perspectives and future directions
4.5
Despite the high-confidence atlas established here, several challenges remain. First, our identification relied on short-read sequencing, which has inherent limitations in fully resolving complex internal splicing patterns and full-length circRNA isoforms. To definitively establish the in vivo roles of specific candidates like psbA-derived circRNAs, future studies should integrate these emerging platforms with CRISPR/Cas9-based functional genomics, building upon the successful precedents established in rice (Zhou et al., 2021). Deciphering whether these molecules act as miRNA sponges or protein scaffolds will provide new avenues for molecular breeding of resilient cotton varieties in the face of global climate change. Deciphering whether these molecules act as miRNA sponges or protein scaffolds will provide new avenues for molecular breeding of resilient cotton varieties in the face of global climate change.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Abdelmohsen K. Panda A. C. Munk R. Grammatikakis I. Dudekula D. B. De S. . (2017). Identification of Hu R target circular RN As uncovers suppression of PABPN 1 translation by Circ PABPN 1. RNA Biol. 14, 361–369. doi: 10.1080/15476286.2017.1279788, PMID: 28080204 PMC 5367248 · doi ↗ · pubmed ↗
- 2Aro E.-M. Virgin I. Andersson B. (1993). Photoinhibition of Photosystem II. Inactivation, protein damage and turnover. Biochim. Biophys. Acta (BBA) - Bioenerg. 1143, 113–134. doi: 10.1016/0005-2728(93)90134-2, PMID: 8318516 · doi ↗ · pubmed ↗
- 3Bhattacharjee S. Gordiy I. Sirohiwal A. Pantazis D. A. (2025). Microscopic basis of reaction center modulation in Psb A variants of photosystem II. Proc. Natl. Acad. Sci. 122, e 2417963122. doi: 10.1073/pnas.2417963122, PMID: 40354529 PMC 12107152 · doi ↗ · pubmed ↗
- 4Chen L.-L. (2016). The biogenesis and emerging roles of circular RN As. Nat. Rev. Mol. Cell Biol. 17, 205–211. doi: 10.1038/nrm.2015.32, PMID: 26908011 · doi ↗ · pubmed ↗
- 5Chen L.-L. (2020). The expanding regulatory mechanisms and cellular functions of circular RN As. Nat. Rev. Mol. Cell Biol. 21, 475–490. doi: 10.1038/s 41580-020-0243-y, PMID: 32366901 · doi ↗ · pubmed ↗
- 6Chen C. Li J. Feng J. Liu B. Feng L. Yu X. . (2021). s RN Aanno—a database repository of uniformly annotated small RN As in plants. Hortic. Res. 8, 45. doi: 10.1038/s 41438-021-00480-8, PMID: 33642576 PMC 7917102 · doi ↗ · pubmed ↗
- 7Chen L. Zhang P. Fan Y. Lu Q. Li Q. Yan J. . (2018 a). Circular RN As mediated by transposons are associated with transcriptomic and phenotypic variation in maize. New Phytol. 217, 1292–1306. doi: 10.1111/nph.14901, PMID: 29155438 · doi ↗ · pubmed ↗
- 8Chen S. Zhou Y. Chen Y. Gu J. (2018 b). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i 884–i 890. doi: 10.1093/bioinformatics/bty 560, PMID: 30423086 PMC 6129281 · doi ↗ · pubmed ↗
