Polyploidization-Driven Functional Innovation of AGPase Small Subunit Gene APS1 Regulates Starch Biosynthesis in Banana (Musa acuminata)

Junmei Sun; Zhao Zhu; Peiguang Sun; Yunen Tu; Xiaowan Hou; Muhammad Moaaz Ali; Yueruxin Jin; Min Zhang; Dongyi Huang; Xiqiang Song; Juhua Liu; Zhiqiang Jin; Hongxia Miao

PMC · DOI:10.3390/ijms27041821·February 14, 2026

TL;DR

This study explores how gene duplication and evolution have led to a key gene, MaAPS1, regulating starch production in bananas, which affects fruit quality and yield.

Contribution

The study identifies MaAPS1 as a functionally differentiated gene resulting from polyploidization, with a role in starch biosynthesis in banana.

Findings

01

MaAPS1 shows increased expression and structural features linked to starch accumulation in banana fruit.

02

Functional validation shows that MaAPS1 silencing reduces starch content, while overexpression increases it.

03

Transcription factors like ERF1 and bZIP1 are suggested to regulate MaAPS1 through promoter interactions.

Abstract

Starch biosynthesis is a fundamental process influencing yield and fruit quality in banana, with ADP-glucose pyrophosphorylase (AGPase) serving as the rate-limiting enzyme catalyzing sucrose conversion into starch. However, the mechanisms underlying functional differentiation of AGPase family genes following polyploidization remain largely unexplored. In this study, eight AGPase genes, including large (MaAPL) and small subunit (MaAPS) members, were identified from the banana (Musa acuminata) genome, all harboring the conserved ADP-glucose pyrophosphorylase domain. Phylogenetic analysis traced their evolutionary origin to the ancient moss Physcomitrella patens, with polyploidization identified as the primary driver of gene family expansion. These genes exhibit conserved codon usage bias and have undergone strong purifying selection. Among them, MaAPS1 displayed distinct functional…

Funding7

—the National Natural Science Foundation of China
—the project of National Key Laboratory for Tropical Crop Breeding
—the Hainan Provincial Natural Science Foundation of China
—the Central Public-interest Scientific Institution Basal Research Fund for Innovative Research Team Program of CATAS
—the project of State Key Laboratory of Tropical Crop Breeding
—the Project of State Key Laboratory of Tropical Crop Breeding
—the Modern Agro-industry Technology Research System of China

Keywords

fruit starch synthesis; regulatory divergence; polyploidization; MaAPS1; transient transformation

Taxonomy

Topics: Food composition and properties · Plant nutrient uptake and metabolism · Phytase and its Applications

Full text

1. Introduction

Starch, the most abundant storage polysaccharide in plants, plays a crucial role in energy metabolism and is extensively accumulated in various tissues. Numerous studies have demonstrated that storage organs such as roots of cassava (Manihot esculenta Crantz), tubers of potato (Solanum tuberosum L.), and cereal seeds [1,2] are particularly rich in starch. Notably, unripe fruits of banana (Musa acuminata Colla) contain over 60% starch by dry weight [3,4,5]. As one of the most widely consumed fresh fruits globally, the banana also serves as a staple food in many regions, particularly in parts of Africa, where it provides a significant source of dietary energy [6,7]. These characteristics make banana a suitable system for studying starch metabolic pathways in non-cereal crops.

Starch biosynthesis is a complex, finely regulated physiological process involving the coordinated actions of several key enzymes. These enzymes function both independently and cooperatively to synthesize the two major components of starch: amylopectin and amylose. ADP-glucose pyrophosphorylase (AGPase) catalyzes the formation of ADP-glucose (ADP-Glc) from glucose-1-phosphate (Glc-1-P) and ATP, which is the committed and rate-limiting step in starch synthesis in plants [8]. Granule-bound starch synthase (GBSS) then incorporates glucose units from ADP-Glc into the elongating amylose chain. Amylopectin synthesis involves a more coordinated enzyme system, including starch branching enzyme (SBE), soluble starch synthase (SS), and starch debranching enzyme (DBE), in addition to AGPase [8,9].

In monocotyledonous plants, AGPase exists as a heterotetrameric complex composed of two large subunits (AGPase large subunit, APL) and two small subunits (AGPase small subunit, APS), with isoforms localized in both the cytosol and plastids [1,10,11]. The APLs serve primarily regulatory functions, sensing metabolic and environmental signals to modulate AGPase activity, although they possess limited catalytic capability [12]. Conversely, the APSs are more conserved and contribute to both catalytic and regulatory functions, constituting the core enzymatic activity of AGPase [13,14]. AGPase plays an important role in plant growth, development, and starch accumulation. In rice (Oryza sativa), mutation of the OsAGPase gene significantly reduces AGPase activity, impairing starch deposition in the endosperm and negatively affecting grain quality [15]. In Arabidopsis thaliana, T-DNA insertion mutants of the APS1 gene exhibit severely reduced AGPase activity, leading to reduced starch synthesis and growth defects [13]. Similarly, AGPase activity is positively correlated with starch content and grain filling in cereals [16,17]. Introduction of a modified Zea mays AGPase gene into wheat (Triticum aestivum) has been shown to increase grain weight and starch accumulation [18], whereas knockdown of AGPase results in reduced seed starch and grain weight [19]. In fruits such as peach (Prunus persica), overexpression of APL1 enhances starch and soluble solid accumulation during ripening [20], underscoring the enzyme’s role beyond seeds and grains. In addition, transcription factors (TFs), including MYBs, RESR, and bZIPs, have been reported to participate in regulating the expression of starch biosynthesis enzyme genes [21,22,23]. OsMYBR1 directly binds to the promoters of OsAPL1 and OsAPL2, resulting in increased amylopectin content and improved rice eating and cooking quality [22]. OsbZIP10 mediates the expression of its target gene OsAPS1 to contribute to grain filling [21]. However, whether MYBs, bZIPs, and other TFs are involved in regulating the expression of banana AGPase genes, and thereby affect fruit starch synthesis, has not been clearly established.

The AGPase gene family has been implicated in the development of banana fruit [3,24], but the evolutionary origin, gene family expansion, and functional divergence of MaAGPase genes remain insufficiently characterized. Polyploidization, a major evolutionary force in plants, often drives gene duplication and functional diversification. However, the contribution of polyploidy to AGPase gene diversification in banana has not been systematically evaluated. To address these knowledge gaps, we performed a comprehensive analysis of AGPase genes across 31 plant species, including model organisms such as O. sativa and A. thaliana, as well as agriculturally important crops like Solanum lycopersicum, Chenopodium quinoa, and M. acuminata. Many of these species have undergone independent whole-genome duplication (WGD) events, providing a comparative framework for investigating the evolutionary and functional diversification of AGPase. Using comparative genomic approaches, including synteny analysis, evolutionary rate estimation, and codon usage bias analysis, we examined the evolutionary patterns and functional conservation of AGPase genes. Our findings revealed widespread functional conservation in species such as S. tuberosum, Sorghum bicolor, O. sativa, and Panicum hallii. Notably, the M. acuminata gene MaAPS1 exhibited evidence of motif expansion following polyploidization and distinct expression characteristics. Additionally, molecular docking predicted potential transcriptional regulators of this gene. Together, these results provide a foundation for understanding the function and regulation of MaAPS1 in banana fruit starch metabolism.

2. Results

2.1. Identification, Phylogenetic Analysis, Evolution, and Expansion of the AGPase Gene Family in Representative Plants

In this study, we systematically identified and analyzed the AGPase gene family across 31 plant species, including a diverse array of economically and evolutionarily significant monocots and dicots. Among the monocots examined were wheat (T. aestivum), Brachypodium distachyon, maize (Z. mays), rice (O. sativa), sorghum (S. bicolor), and banana (M. acuminata). Dicots included A. thaliana, Arabidopsis halleri, grape (Vitis vinifera), potato (S. tuberosum), tomato (S. lycopersicum), and soybean (Glycine max), alongside related legumes such as Vigna radiata, Vigna unguiculata, Phaseolus vulgaris, and Pisum sativum. We also included the early-diverging liverwort Marchantia polymorpha, which holds an important phylogenetic position among extant land plants.

The species tree based on AGPase orthogroups illustrates the evolutionary relationships among these species (Figure 1A), while the number of AGPase proteins identified per species is shown in Figure 1B and Table S1. Orthologous gene inference allowed us to identify AGPase family members in each plant species and categorize them into 10 orthogroups. The distribution of these AGPase proteins across orthogroups is detailed in Figure 1C and Table S2. Our analysis indicates that the AGPase gene family likely originated from ancestral lineages represented by M. polymorpha. As plant lineages diversified, AGPase genes underwent duplication and functional divergence, giving rise to a gene family with variable copy numbers across plant taxa.

Among the orthogroups, OG0 stood out due to its relatively high gene copy number in many species, in contrast to the low copy numbers or gene loss observed in other orthogroups. The relative number of genes assigned to orthogroups versus unassigned genes across species is illustrated in Figure 1D. Phylogenetic analysis based on protein sequence alignments revealed a more complex topology for OG0, consistent with extensive diversification.

Conserved motif analysis further revealed variation across orthogroups. OG0 primarily contained Motifs 1, 3, and 4, whereas OG1 was characterized by Motifs 4, 5, and 9 (Table S3). Evolutionary expansion of motifs was observed in higher plants; for example, while M. polymorpha only contained Motifs 1–9, Motif 10 appeared in species such as banana (M. acuminata), apple (Malus domestica), and pear (Pyrus bretschneideri), suggesting that it arose later during plant evolution (Figure S1). The circular phylogenetic tree combined with motif architecture clearly illustrates this motif diversification (Figure 1E).

2.2. Polyploidization as a Major Driver of AGPase Gene Family Expansion

To investigate the mechanisms underlying AGPase gene family expansion, we examined the genomic distribution, duplication patterns, and evolutionary history of AGPase genes across different plant species. A circular bar chart showing the total number of AGPase genes and their distribution across species highlighted considerable variability, with some lineages (e.g., T. aestivum, M. acuminata) possessing markedly higher gene counts (Figure 2A).

Given that WGD is a primary driver of gene family expansion in plants, we classified AGPase duplication events into three categories—WGD/segmental, dispersed, and proximal—using MCScanX (Table S4). Most species showed a dominant contribution of WGD or segmental duplications (blue bars), accounting for over 60% of total AGPase duplicates in the majority of lineages (Figure 2B). Notably, in Pyrus communis, proximal duplication also contributed significantly, while species such as S. tuberosum, S. viridis, O. sativa, and M. acuminata exhibited dual enrichment in WGD and dispersed duplication events. This pattern suggests that AGPase gene expansion in these species may have been shaped by both large-scale polyploid events and smaller-scale duplications.
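The per-species shares of each duplication class described above can be tallied with a few lines of Python. This is only an illustrative sketch: the function name `duplication_shares` and the example labels are hypothetical, and the counts are not data from this study.

```python
from collections import Counter


def duplication_shares(classes):
    """Return the percentage of genes assigned to each duplication class."""
    counts = Counter(classes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical class labels for the AGPase genes of one species
# (illustrative only, not values reported in the paper).
example = ["WGD/segmental"] * 5 + ["dispersed"] * 2 + ["proximal"]
shares = duplication_shares(example)
for label, pct in shares.items():
    print(f"{label}: {pct:.1f}%")
```

A species would count as WGD-dominated in the sense used here when its "WGD/segmental" share exceeds the 60% level mentioned in the text.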

In M. acuminata, more than 70% of AGPase gene copies were attributed to WGD or segmental duplication, supporting a prominent role of ancient polyploidization in banana genome evolution. To assess the evolutionary timing of duplication events, we conducted synonymous substitution rate (K_s_) analysis. Density plots of K_s_ values for AGPase gene pairs across species showed distinct peaks corresponding to past WGD events (Figure 2C). For instance, Z. mays, O. sativa, and P. hallii exhibited clear K_s_ peaks, consistent with known genome duplication histories. In contrast, some species, such as Medicago truncatula and V. vinifera, showed flatter K_s_ distributions, indicating older or more diffuse duplication events. Together, the duplication class distribution and K_s_ profiles support a major contribution of polyploidy to the expansion of the AGPase family across angiosperms.

2.3. Conservation of Synteny and Codon Usage Bias in AGPase Genes Across Species

To investigate the evolutionary relationships and functional conservation of AGPase genes across species, we performed a comparative genomic analysis between banana (M. acuminata) and 16 other plant species (Figure 3A). This analysis revealed conserved syntenic relationships between AGPase genes in banana and those in several species, with the highest numbers of orthologous gene pairs found in Setaria italica, S. bicolor, and Panicum hallii (Table S5). These findings suggest a closer evolutionary relationship of AGPase genes between banana and these grass species.

We further calculated the nonsynonymous to synonymous substitution rate ratios (K_a_/K_s_) of AGPase genes to assess the selective pressures acting upon them. In most species, K_a_/K_s_ ratios were below 1, indicating that purifying selection has played a dominant role in maintaining the functional integrity of AGPase genes. Figure 3B illustrates that only a few species exhibited K_a_/K_s_ ratios greater than 1, suggesting potential episodes of positive selection.
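The interpretation of K_a_/K_s_ ratios used here follows the conventional reading: below 1 suggests purifying selection, about 1 neutral evolution, and above 1 positive selection. A minimal sketch of that rule is given below; the function name `classify_selection` and the input values are hypothetical and do not reproduce the Ka/Ks Calculator itself.

```python
def classify_selection(ka, ks):
    """Return (Ka/Ks ratio, label) for one gene pair.

    Conventional interpretation: ratio < 1 purifying, = 1 neutral,
    > 1 positive selection. Ks must be positive for the ratio to exist.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if ratio < 1:
        label = "purifying"
    elif ratio > 1:
        label = "positive"
    else:
        label = "neutral"
    return ratio, label


# Hypothetical substitution rates for two gene pairs (not from the study).
print(classify_selection(0.02, 0.40))
print(classify_selection(0.60, 0.30))
```

In practice, ratios computed from very small K_s_ values are unstable, so pairs with near-zero K_s_ are often inspected separately before drawing conclusions.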

To evaluate codon usage patterns of AGPase genes, we performed a codon usage bias analysis. The heatmap of Relative Synonymous Codon Usage (RSCU) values (Figure 3C) showed moderate variation in codon preference among species, with most codons displaying RSCU values near or below 1. Additionally, violin plots of codon usage indices (Figure 3D) revealed that the codon adaptation index (CAI) was less than 0.5, the codon bias index (CBI) was near zero, the frequency of optimal codons (FOP) was below 0.5, and the effective number of codons (ENC) exceeded 40. Collectively, these metrics indicate weak codon usage bias in AGPase genes across species, consistent with conservation and functional constraint.
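For readers unfamiliar with RSCU, the value for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally, so a value of 1 means no bias. The sketch below computes it for a single coding sequence using the standard genetic code. The function name `rscu` and the toy sequence are hypothetical, and the software actually used for Figure 3C is not stated in this section.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Relative synonymous codon usage for one coding sequence.

    RSCU = observed codon count * number of synonymous codons
           / total count for that amino acid.
    Amino acids that are absent, and single-codon amino acids (Met, Trp),
    are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0 or len(family) == 1:
            continue
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


# Toy sequence: four alanine codons, three GCT and one GCC.
values = rscu("GCTGCTGCTGCC")
print(values)
```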

2.4. Structural Characteristics and Spatiotemporal Expression Patterns of AGPase Genes in Banana

An in-depth analysis of the AGPase family in M. acuminata revealed notable variations in motif composition and gene structure. As shown in Figure 4A, sequence analysis identified that Ma06_t28940.2, Ma06_t28940.1, Ma09_t06650.1, Ma04_t02930.1, Ma04_t10600.1, and Ma01_t05380.1 harbor all ten conserved motifs (Motif1–Motif10). In contrast, Ma01_t00130.1 (MaAPS1) lacks Motif10, while Ma03_t22640.1 is missing both Motif10 and Motif7, suggesting potential structural divergence among family members.

Further structural analysis indicated that the same six genes also possess a higher number of coding sequences (CDSs) compared to MaAPS1 and Ma03_t22640.1. The greater number of CDSs may reflect more complex gene structure. Gene duplication analysis revealed that Ma06_t28940.1, Ma09_t06650.1, Ma04_t02930.1, and Ma04_t10600.1 originated from whole-genome duplication (WGD) events, whereas Ma01_t05380.1, MaAPS1, and Ma03_t22640.1 resulted from dispersed duplication events.

Synteny analysis revealed strong collinearity between Ma06_t28940.2 and three other genes, Ma04_t02930.1, Ma06_t28940.1, and Ma09_t06650.1, providing insights into their evolutionary history. These syntenic relationships are illustrated by the connecting arcs in Figure 4A. Promoter analysis revealed that Ma04_t02930.1 and MaAPS1 contain a high density of cis-acting regulatory elements, with the promoter of Ma04_t02930.1 exhibiting greater diversity in element types (Figure 4B).

Spatiotemporal expression profiling demonstrated that Ma04_t02930.1, Ma06_t28940.1, Ma09_t06650.1, and MaAPS1 are predominantly expressed in fruit tissue, with expression levels increasing from 0 days after the emergence of the inflorescence from the pseudostem (DAF) to 0 days post-harvest (DPH) (Figure 4C and Table S6). Notably, Ma04_t02930.1 and MaAPS1 exhibited the highest expression levels. The dynamic expression patterns across banana fruit developmental stages are further illustrated by the heatmap in Figure 4D and Table S7.

2.5. Co-Localization of MaAPS1 and Its Function in Starch Synthesis of Banana Fruit

The open reading frame (ORF) of MaAPS1 was fused to green fluorescent protein (GFP) and transiently expressed in Nicotiana benthamiana leaf cells. Co-localization with a chloroplast red fluorescent protein (RFP) marker showed that the MaAPS1-GFP fusion protein was predominately co-localized in chloroplasts, whereas the GFP control exhibited diffuse cellular distribution (Figure 5A).

Transient silencing of MaAPS1 in banana fruit discs resulted in visibly lighter iodine–potassium iodide (I_2_-KI) staining compared to the control (Figure 5B), indicating reduced starch accumulation. Expression analysis confirmed significant downregulation of MaAPS1 in silenced tissues (Figure 5C). Quantification of carbohydrate components showed that total starch, amylopectin, and amylose contents were reduced by 14.65%, 9.59%, and 5.07%, respectively, compared to the empty vector control (Figure 5D).

In contrast, transient overexpression of MaAPS1 resulted in altered I_2_-KI staining intensity (Figure 5E). MaAPS1 transcript levels were significantly increased in overexpressing tissues (Figure 5F). However, total starch, amylopectin, and amylose contents did not show a consistent increase, instead exhibiting changes of approximately 14%, 8.53%, and 5.39%, respectively (Figure 5G). These findings confirm that modulation of MaAPS1 expression affects starch metabolism, rather than uniformly enhancing starch accumulation.

2.6. Prediction of Upstream Transcriptional Regulators of Banana MaAPS1

To explore potential transcriptional regulators of MaAPS1, we conducted in silico prediction of TFs interacting with its promoter region. The predicted regulatory network is shown in Figure 6A and Table S8. Expression profiling revealed that ERF1, C3H1, bZIP1, and bZIP3 exhibited elevated transcript levels during early stages of banana fruit development (Figure 6B and Table S9). Binding potential scores derived from computational prediction indicated possible interactions between these TFs and the MaAPS1 promoter, with values of 17.4308 for ERF1, 14.0154 for C3H1, 9.6154 for bZIP1, and 13.2857 for bZIP3 (Figure 6C).

To further assess these predicted interactions, molecular docking simulations were performed to estimate the binding affinity of each TF to specific cis-regulatory sequences within the MaAPS1 promoter. The analysis identified distinct binding motifs: bZIP1 was predicted to bind the sequence TAGGATCACGTGGGA (Figure 6D), bZIP3 to CCACGTGGCC (Figure 6E), ERF1 to TGTCCATGTCGACGGCTCATG (Figure 6F), and C3H1 to GAAGAAAAAGTTAC (Figure 6G). Notably, all predicted interactions involved key arginine (Arg) residues within the DNA-binding domains of the respective TFs, suggesting a conserved mode of promoter recognition and potential transcriptional activation.

Together, these results suggest potential binding interactions between ERF1, C3H1, bZIP1, and bZIP3 and specific cis-regulatory motifs within the MaAPS1 promoter (Figure 6D–G). These findings identify candidate TFs that may participate in the transcriptional regulatory network underlying starch synthesis and establish a theoretical basis for future molecular manipulation of AGPase expression to enhance starch accumulation in banana, pending experimental validation.

3. Discussion

Starch synthesis is a fundamental biological process that supports plant growth and development, particularly in crops with high starch accumulation [7]. Extensive research in cereal crops has resulted in the identification of key genes and TFs involved in starch biosynthesis, including OsISA1/2 [25], OsRESR1 [23], TaDL/B3 [26], IbGBSSI [27], ZmSSRP1 [28], OsMYBR1 [29], and ZmARF27 [30]. However, our understanding of the genetic regulation of starch synthesis in banana (M. acuminata) remains limited [31], with only a few genes such as MaSBE2.3 [32], MaSSIII-1 [33], and MaGBSSI-3 [34] previously characterized.

Among the starch biosynthetic genes, AGPase plays a pivotal, rate-limiting role in catalyzing the conversion of glucose-1-phosphate (Glc-1-P) and ATP into ADP-glucose (ADP-Glc), the key precursor for starch synthesis [35]. In this study, we identified eight AGPase family members in the M. acuminata genome, all of which contain the conserved ADP-glucose pyrophosphorylase domain, consistent with findings in soybean (G. max) [36], sweet potato (Ipomoea batatas) [37], pear (Pyrus bretschneideri) [38], and potato (S. tuberosum) [39]. Orthologous gene clustering classified these into two major orthogroups, OG0 and OG1. Phylogenetic analysis revealed that AGPase genes have a deep evolutionary origin, dating back to the early-diverging land plants, indicating evolutionary conservation of this gene family.

Gene duplication analysis revealed that WGD was the predominant force driving AGPase gene family expansion in banana, consistent with patterns observed in other plant species. The MaAPS1 gene, in particular, exhibited characteristics of regulatory divergence following polyploidization, including an increased number of introns, more complex promoter architecture with diverse cis-elements, and elevated expression levels. These features are consistent with functional differentiation, although they do not imply neofunctionalization.

Previous studies in rice have shown that AGPase activity directly impacts the efficiency of starch accumulation in endosperm tissues [21]. Mutations in AGPase genes can enhance starch content while reducing soluble sugars. However, the in vivo function of AGPase in banana fruit development has remained less well defined. Here, transient silencing of MaAPS1 resulted in reduced total starch, amylose, and amylopectin contents in banana fruits, supporting its involvement in starch biosynthesis. In contrast, transient overexpression of MaAPS1 did not lead to a proportional increase in starch content, suggesting that AGPase activity may be constrained by subunit stoichiometry or regulatory balance [40,41]. Taken together, these findings indicate that appropriate regulation of MaAPS1 expression is important for maintaining starch metabolic homeostasis.

Transcriptional regulation is essential for coordinating starch biosynthetic gene expression. Among known regulators, bZIP TFs have been shown to enhance starch biosynthesis in various species [42,43]. In cassava (Manihot esculenta), silencing of bZIP2 downregulates genes such as APL1, ISA1, and GBSSI, thereby reducing starch content [43]. In rice, the ERF44 TF promotes starch accumulation by activating GBSSI, SSI, BEIIb, ISA2, and ISA3 [29]. Similarly, the AP2/ERF family member ZREB167 influences starch biosynthesis in maize through regulation of sugar transporter genes SUT2/4 [44], while the zinc finger protein ZFP2 contributes to grain development in maize [45]. In this study, we identified four candidate TFs—bZIP1, bZIP3, ERF1, and C3H1—with predicted binding affinity to the MaAPS1 promoter. Expression analysis showed that these TFs are co-expressed with MaAPS1 during early fruit development. Furthermore, molecular docking simulations confirmed the potential for direct binding to the promoter region, suggesting that these TFs may modulate MaAPS1 expression and thus influence starch accumulation in banana fruits. It should be noted that these interactions are based on computational prediction and expression correlation, and direct physical binding has not yet been experimentally validated. Nevertheless, the observed co-expression patterns during early fruit development suggest a possible regulatory association between these TFs and MaAPS1 expression.

Overall, this study provides a comprehensive analysis of the AGPase gene family in banana, highlighting its evolutionary conservation, polyploidy-associated expression, and functional relevance to starch metabolism. The identification of MaAPS1 as a gene with distinct structural, expression, and functional characteristics adds to current understanding of starch biosynthesis regulation in banana fruit. While further experimental validation is required, these findings provide a framework for future studies aimed at improving starch content and fruit quality in banana through molecular and genomic approaches.

4. Materials and Methods

4.1. Data Source and Sequence Retrieval

The genomic sequences of the species analyzed in this study were obtained from the following sources. The genome data for banana (M. acuminata) were retrieved from the Banana Genome Database [7]. Genomic data for the remaining 30 plant species were downloaded from the Ensembl Plants database (https://plants.ensembl.org/info/data/ftp/, accessed on 3 March 2025). Transcriptome datasets representing various developmental stages and tissue-specific expression profiles of M. acuminata (BioProject ID: PRJNA432894) were acquired from the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) [46].

To identify AGPase gene family members, Hidden Markov Model (HMM)-based sequence retrieval was conducted across all 31 genomes using HMMER v3.1 [47], employing an e-value threshold of ≤1 × 10^−5^. The HMM profile corresponding to the AGPase family was used as the query for this search.
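The e-value cut-off described above can be applied to HMMER's tabular output after the search. The sketch below assumes results were written with `hmmsearch --tblout`, where the first whitespace-separated column is the target name and the fifth is the full-sequence E-value. The function name `filter_hmmer_hits`, the gene names, and the scores are hypothetical and not taken from the study.

```python
def filter_hmmer_hits(tblout_lines, max_evalue=1e-5):
    """Keep (target, E-value) pairs at or below the E-value threshold.

    Expects lines in hmmsearch --tblout format: column 1 is the target
    name and column 5 is the full-sequence E-value. Comment lines
    starting with '#' and blank lines are ignored.
    """
    hits = []
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            hits.append((target, evalue))
    return hits


# Hypothetical tblout excerpt (illustrative values only).
sample = [
    "# target name  accession  query name  accession  E-value  score  bias",
    "geneA  -  NTP_transferase  -  1.2e-80  270.1  0.0",
    "geneB  -  NTP_transferase  -  3.0e-03   12.4  0.1",
]
hits = filter_hmmer_hits(sample)
print(hits)
```

Candidates that pass this filter would still need the domain validation step described in Section 4.2.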

4.2. Genome-Wide Identification of AGPase

The HMM profile corresponding to the conserved core domain of AGPase—NTP_transferase—was obtained from the Pfam database [48]. Initial candidate gene screening was carried out using HMMER. To ensure the accuracy of domain identification and the integrity of protein sequences, all predicted AGPase candidates were subsequently validated by integrating information from multiple databases, including SMART [49], Pfam [48], and InterPro [50].

4.3. Orthology Inference and Motif Composition Analysis

Orthologous groups of AGPase proteins across the 31 plant species were inferred using OrthoFinder2 [51]. A single representative sequence per species was selected for downstream phylogenetic and comparative analysis. The MUSCLE v3.8.31 software was used to perform the multiple sequence alignments. Based on the alignment results, a phylogenetic tree and orthologous relationship visualization were generated using the R package ggtree [52].

To investigate functional domains, all AGPase amino acid sequences were queried against the Pfam and Conserved Domain Database (CDD). Regions lacking known motifs were further analyzed using MEME v4.9.0 [53] for de novo motif discovery, with default parameters applied.

4.4. Collinearity and Comparative Genomic Analysis

Intra- and inter-species syntenic relationships were visualized using Dual Synteny Plotter [54]. BLASTP alignments, combined with the MCScan algorithm [55], were used to identify syntenic blocks. Protein sequences of collinear gene pairs were aligned using MUSCLE, and corresponding coding sequences (CDS) were aligned accordingly. The ratio of non-synonymous (K_a_) to synonymous (K_s_) substitution rates was calculated by the Ka/Ks Calculator [56] to assess selection pressure acting on AGPase gene pairs.

4.5. Classification of Gene Duplication Events

To identify duplication events, gene collinearity analysis was conducted using BLAST (v 2.16.1) and MCScanX (v 1.0.1) [57]. The duplicate_gene_classifier module [58] was employed to categorize AGPase genes into duplication types such as WGD, tandem, dispersed, and proximal duplications. These data were interpreted within the context of established paleopolyploidy frameworks [59,60] to infer the contribution of polyploidization to AGPase gene expansion.

4.6. Transcriptomic Analysis

The banana cultivar ‘Baxi Jiao’ (M. acuminata AAA genotype) was planted in the Banana Germplasm Nursery located in Danzhou City, Hainan Province, China. An RNAprep Pure Plant Kit (Tiangen, Beijing, China) was used to extract RNA from the roots, leaves, fruit, and pulps at 0, 20, and 80 DAF and at the ripening stages of 8 and 14 DPH. Deep sequencing was performed on an Illumina GAII platform (Illumina, Inc., San Diego, CA, USA). FastQC (v 0.12) and the FASTX toolkit were used to remove low-quality reads and adapter sequences. Transcriptome assembly was performed with Cufflinks. Gene expression levels were represented as RPKM values. Differentially expressed genes (DEGs) were screened using the DESeq package. Two technical replicates and three biological replicates were evaluated in the sequencing process. Based on the RPKM values of the MaAGPases, a heatmap was constructed using MeV 4.9.0 software. The transcriptome datasets generated in this study have been deposited in the NCBI Sequence Read Archive (SRA) under accession numbers SRX3938704, SRX3938706, SRX3938707, SRX3938708, SRX3938709, SRX3938715, and SRX3938722, within BioProject PRJNA432894 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA432894, accessed on 3 March 2025).
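RPKM (reads per kilobase of transcript per million mapped reads) normalizes a raw read count by both gene length and sequencing depth. A minimal sketch of the standard formula follows. The function name `rpkm` and the input numbers are hypothetical, and in the pipeline described above the values would come from Cufflinks rather than from this calculation.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads, 2000 bp transcript, 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))
```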

4.7. Subcellular Localization of MaAPS1

The open reading frame of MaAPS1 was cloned into the pCAMBIA3300-GFP vector to generate the MaAPS1-GFP fusion construct. The recombinant and marker plasmids were transformed into Agrobacterium tumefaciens strain LBA4404, and Agrobacterium-mediated transient transformation was performed in N. benthamiana leaves [4]. After incubation for 48 h at 25 °C, fluorescence was observed with a confocal laser scanning microscope (CLSM; Nikon Corporation, Tokyo, Japan).

4.8. Transient Overexpression and Silencing of MaAPS1 in Banana Fruit

The open reading frame of MaAPS1 was cloned into the pCAMBIA3300 vector using the Xba I and Kpn I restriction enzymes. The C-terminal activation domain of MaAPS1 was inserted into the pTRV2 vector using the same Xba I and Kpn I enzymes. The constructed plasmids were transferred into A. tumefaciens strain GV3101. Thin slices of banana fruit at 80 DAF were soaked in the Agrobacterium suspension (OD_600_ = 0.6) and then placed on MS medium at 30 °C for 3 d [4]. Afterward, I_2_-KI staining was performed, and the contents of total starch, amylopectin, and amylose were measured [4]. The entire experiment was repeated independently three times.

4.9. Quantitative Reverse Transcriptase PCR (RT-qPCR) Analysis

Total RNA was extracted using TRIzol^®^ Reagent (Takara Bio Inc., Shiga, Japan). First-strand cDNA was synthesized using the PrimeScript™ RT reagent Kit with gDNA Eraser (Takara Bio Inc., Shiga, Japan), following the manufacturer’s instructions. RT-qPCR was then performed on a qTOWER3G system using the SYBR^®^ Premix Ex Taq™ (Tli RNaseH Plus) kit (Takara Bio Inc., Shiga, Japan). All primer sequences used in this study are listed in Table S10, with Actin (EF672732) and UBQ2 (HQ853254) serving as internal reference genes. Three independent biological replicates were analyzed. The relative expression level of each target gene was calculated using the 2^−ΔΔCT^ method [61].
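The 2^−ΔΔCT^ calculation can be sketched as follows. Two points are assumptions of this sketch and are not stated in the text: the two reference genes are combined by averaging their Ct values, and amplification efficiency is taken as 100% (a doubling per cycle). The function name and Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_refs_sample,
                        ct_target_control, ct_refs_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Reference genes are combined by averaging their Ct values
    (an assumption of this sketch).
    """
    d_sample = ct_target_sample - mean(ct_refs_sample)     # dCt, treated sample
    d_control = ct_target_control - mean(ct_refs_control)  # dCt, control sample
    ddct = d_sample - d_control
    return 2 ** (-ddct)


# Hypothetical Ct values: target gene plus two reference genes.
fold = relative_expression(22.0, [18.0, 20.0], 24.0, [18.0, 20.0])
```

In this example the target Ct drops by two cycles relative to the references, which corresponds to a four-fold increase in expression.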

4.10. Prediction of Transcriptional Regulators and Molecular Docking of MaAPS1

Putative TFs binding to the promoter region of MaAPS1 were predicted using the Plant Transcription Factor Database (PlantTFDB v5.0). Protein-DNA interactions between the identified TFs and cis-regulatory motifs within the MaAPS1 promoter were modeled using AlphaFold3 [62]. The resulting docking models were visualized in three dimensions using PyMOL v2.5.2 [63] to explore potential regulatory mechanisms.

4.11. Statistical Analysis

Data were analyzed using R v4.4.2. For datasets involving multiple comparisons, statistical significance was determined using the Least Significant Difference (LSD) test at a significance level of α = 0.05. Plots and figures were generated using R packages.
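The logic of an LSD comparison can be illustrated with a short sketch for equal-sized groups: two group means differ significantly when their absolute difference exceeds t × sqrt(2 × MSE / n), where MSE is the within-group mean square. This is not the R code used in the study. The data are hypothetical, and the critical t value is supplied by hand (2.447 for a two-sided test at α = 0.05 with 6 error degrees of freedom) to keep the example free of external libraries.

```python
from math import sqrt
from statistics import mean


def lsd_threshold(groups, t_crit):
    """Least significant difference for equal-sized groups."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    df_error = len(groups) * (n - 1)
    # Within-group (error) sum of squares
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    mse = sse / df_error
    return t_crit * sqrt(2 * mse / n)


# Hypothetical measurements for three treatments, three replicates each.
groups = [[10.0, 11.0, 12.0], [14.0, 15.0, 16.0], [10.5, 11.5, 12.5]]
T_CRIT = 2.447  # two-sided t critical value, alpha = 0.05, 6 error df
lsd = lsd_threshold(groups, T_CRIT)
means = [mean(g) for g in groups]
significant = {
    (i, j): abs(means[i] - means[j]) > lsd
    for i in range(len(groups))
    for j in range(i + 1, len(groups))
}
```

The LSD test is commonly applied only after an overall ANOVA F-test is significant, since it does not otherwise control the family-wise error rate.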

Bibliography

1. Zeeman S.C.; Kossmann J.; Smith A.M. Starch: Its metabolism, evolution, and biotechnological modification in plants. Annu. Rev. Plant Biol. 2010, 61, 209–234. doi:10.1146/annurev-arplant-042809-112301. PMID: 20192737.
2. Toyosawa Y.; Kawagoe Y.; Matsushima R.; Crofts N.; Ogawa M.; Fukuda M.; Kumamaru T.; Okazaki Y.; Kusano M.; Saito K. Deficiency of starch synthase IIIa and IVb alters starch granule morphology from polyhedral to spherical in rice endosperm. Plant Physiol. 2016, 170, 1255–1270. doi:10.1104/pp.15.01232. PMID: 26747287. PMCID: PMC4775109.
3. Jourda C.; Cardi C.; Gibert O.; Giraldo Toro A.; Ricci J.; Mbéguié-A-Mbéguié D.; Yahiaoui N. Lineage-specific evolutionary histories and regulation of major starch metabolism genes during banana ripening. Front. Plant Sci. 2016, 7, 1778. doi:10.3389/fpls.2016.01778. PMID: 27994606. PMCID: PMC5133247.
4. Miao H.X.; Sun P.G.; Zhu W.N.; Liu Q.; Zhang J.B.; Jia C.H.; Sun J.M.; Zhu Z.; Xie J.H.; Wang W. Exploring the function of MaPHO1 in starch degradation and its protein interactions in postharvest banana fruits. Postharvest Biol. Technol. 2024, 209, 112687. doi:10.1016/j.postharvbio.2023.112687.
5. Luo T.T.; Zhang H.; Tan H.K.; Zhang L.T.; Wei W.; Shan W.; Kuang J.F.; Chen J.Y.; Lu W.J.; Yang Y.Y. Two MYB transcription factors interact to inhibit the expression of cell wall metabolism and starch degradation genes in banana. Plant Physiol. 2025, 6239. doi:10.1093/plphys/kiaf239. PMID: 40478863.
6. Gibert O.; Dufour D.; Giraldo A.; Sánchez T.; Reynes M.; Pain J.P.; González A.; Fernández A.; Díaz A. Differentiation between cooking bananas and dessert bananas. 1. Morphological and compositional characterization of cultivated Colombian Musaceae (Musa spp.) in relation to consumer preferences. J. Agric. Food Chem. 2010, 57, 7857–7869. doi:10.1021/jf901788x. PMID: 19691321.
7. D’Hont A.; Denoeud F.; Aury J.M.; Baurens F.C.; Carreel F.; Garsmeur O.; Noel B.; Bocs S.; Droc G.; Rouard M. The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 2012, 488, 213–217. doi:10.1038/nature11241. PMID: 22801500.
8. Batra R.; Saripalli G.; Mohan A.; Gupta S.; Gill K.S.; Varadwaj P.K.; Balyan H.S.; Gupta P.K. Comparative analysis of AGPase genes and encoded proteins in eight monocots and three dicots with emphasis on wheat. Front. Plant Sci. 2017, 24819. doi:10.3389/fpls.2017.00019. PMID: 28174576. PMCID: PMC5259687.