# The reference genome sequence of the scarlet follicle, Sterculia lanceolata, reveals a paleo-polyploidization and its impact on fruit quality and fruit dehiscence

**Authors:** Youtao Hu, Youpeng Zhang, Hongbin Zhang, Jiahao Zhang, Guilian Guo, Kejing Yang, Bing-Yan Shao, Jia-Yu Xue, Robert Henry, Wenquan Wang, Fei Chen

PMC · DOI: 10.1007/s10142-026-01829-9 · 2026-02-16

## TL;DR

This study provides a high-quality genome sequence for Sterculia lanceolata, revealing its evolutionary history and genes linked to fruit traits.

## Contribution

This work presents the first chromosome-level genome assembly of Sterculia lanceolata and uses it to uncover a paleo-polyploidization and its impact on fruit quality and fruit dehiscence.

## Key findings

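- The genome assembly spans 602.8 Mb across 20 pseudochromosomes, with a contig N50 of 29.3 Mb and 98.7% BUSCO completeness.
- Comparative genomic analyses identified ancient whole-genome duplication (WGD) events and clarified the trajectory of karyotype evolution in Malvaceae.
- Key regulatory gene families associated with fruit dehiscence (homologs of SHP1/2, FUL, IND, and ALC) have expanded extensively in S. lanceolata as a consequence of ancient polyploidy.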

## Abstract

Genetic research and breeding of Sterculia lanceolata, a tree species of the Malvaceae family with notable ornamental and medicinal value, have long been constrained by the lack of genomic resources. In this study, we report the first high-quality, chromosome-level genome assembly of this species, with the aim of elucidating its evolutionary history and the genetic basis of key traits. We assembled the genome from PacBio HiFi reads and anchored it into 20 pseudochromosomes using Hi-C data, yielding a final assembly of 602.8 Mb with a contig N50 of 29.3 Mb and a BUSCO completeness of 98.7%. We annotated 35,873 protein-coding genes, with an annotation rate of 96.4%. By integrating genomic data from other Malvaceae species, we analyzed the karyotype evolution of S. lanceolata and revealed the basal ploidy level of the family. Comparative genomic analyses uncovered significant syntenic relationships and whole-genome duplication (WGD) events among Malvaceae species, thereby clarifying the trajectory of karyotype evolution. Moreover, we identified key regulatory gene families associated with fruit dehiscence (homologs of SHP1/2, FUL, IND, and ALC) that have undergone extensive expansion in S. lanceolata as a consequence of ancient polyploidy events. This reference genome not only serves as a critical resource for evolutionary research in Malvaceae but also establishes a foundation for molecular breeding, genetic improvement, and conservation of S. lanceolata and related species.
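
The contig N50 quoted above is a standard contiguity statistic: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are hypothetical and are not taken from the paper or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (hypothetical, not from the paper): sort lengths
    from longest to shortest and return the length at which the running
    total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Made-up contig lengths, for illustration only.
toy_contigs = [40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied to the real contig lengths of an assembly (for example, parsed from a FASTA index), the same calculation would give the reported figure in base pairs.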

## Linked entities

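- **Genes:** SHP1 (K-box region and MADS-box transcription factor family protein) [NCBI Gene 825047], SHP2 (K-box region and MADS-box transcription factor family protein) [NCBI Gene 818883], FUL (AGL8, AGAMOUS-like 8) [NCBI Gene 836212], IND (basic helix-loop-helix (bHLH) DNA-binding superfamily protein) [NCBI Gene 827911], ALC (basic helix-loop-helix (bHLH) DNA-binding superfamily protein) [NCBI Gene 836846]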
- **Species:** Sterculia lanceolata (taxon 190249), Malvaceae (taxon 3629)

## Full-text entities

- **Genes:** SHP1 (K-box region and MADS-box transcription factor family protein) [NCBI Gene 825047] {aka AGAMOUS-like 1, AGL1, SHATTERPROOF 1}, SHP2 (K-box region and MADS-box transcription factor family protein) [NCBI Gene 818883] {aka AGAMOUS-like 5, AGL5, F7D19.17, F7D19_17, SHATTERPROOF 2}, AGL8 (AGAMOUS-like 8) [NCBI Gene 836212] {aka FRUITFULL, FUL, MSL3.30, MSL3_30}, IND (basic helix-loop-helix (bHLH) DNA-binding superfamily protein) [NCBI Gene 827911] {aka EDA33, EMBRYO SAC DEVELOPMENT ARREST 33, F6N15.18, F6N15_18, GT140, IND1}, ALC (basic helix-loop-helix (bHLH) DNA-binding superfamily protein) [NCBI Gene 836846] {aka ALCATRAZ, K21H1.7, K21H1_7}
- **Diseases:** trauma (MESH:D014947), inflammatory (MESH:D007249), dehiscence (MESH:D013529), pain (MESH:D010146), swelling (MESH:D004487), blood stasis (MESH:D014647), bruising (MESH:D003288)
- **Chemicals:** amylose (MESH:D000688), acid (MESH:D000143), P (MESH:D010758), nitrogen (MESH:D009584), K (MESH:D011188), starch (MESH:D013213), Flavonoids (MESH:D005419), Mg (MESH:D008274)
- **Species:** Syzygium samarangense (Java-apple, species) [taxon 260143], Hibiscus schizopetalus (species) [taxon 1109428], Malus domestica (apple, species) [taxon 3750], Theobroma cacao (cacao, species) [taxon 3641], Gossypium hirsutum (American cotton, species) [taxon 3635], Firmiana kwangsiensis (species) [taxon 1863013], Decalobanthus boisianus (species) [taxon 1835031], Ficus altissima (council-tree, species) [taxon 309270], Brassica napus (oilseed rape, species) [taxon 3708], Spathodea campanulata (African tulip tree, species) [taxon 211926], Durio zibethinus (durian, species) [taxon 66656], Arabidopsis thaliana (mouse-ear cress, species) [taxon 3702], Gossypium raimondii (Peruvian cotton, species) [taxon 29730], Bombax ceiba (Indian kapok, species) [taxon 45325], Camellia sinensis (black tea, species) [taxon 4442], Solanum tuberosum (potatoes, species) [taxon 4113], Sterculia lanceolata (species) [taxon 190249]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12909369/full.md

---
Source: https://tomesphere.com/paper/PMC12909369