Complete chloroplast genome data of Plumbago auriculata f. alba (Pasq.) Z. X. Peng, a medicinal and ornamental species
Jiaqiang Zhang, Danbin Xu, Songlin Wang, Shizhen Wang, Yuan Zhou, Bowei He, Juan Zhang

TL;DR
This paper presents the complete chloroplast genome of a medicinal and ornamental plant, providing insights into its genetic structure and evolutionary relationships.
Contribution
The study provides the first complete chloroplast genome sequence for Plumbago auriculata f. alba, enriching genetic resources for molecular breeding and phylogenetic research.
Findings
The chloroplast genome is 169,357 bp long with a typical quadripartite structure.
The genome contains 129 annotated genes and 104 SSR loci, dominated by mononucleotide repeats.
Phylogenetic analysis shows P. auriculata f. alba is closely related to Plumbago auriculata with 100% bootstrap support.
Abstract
Plumbago auriculata f. alba (Pasq.) Z. X. Peng, a perennial evergreen shrub, is valued for both its medicinal properties and its ornamental qualities. It is commonly used to treat inflammatory disorders and skin infections, while its white flowers make it a popular landscape plant. In this study, the complete chloroplast genome of Plumbago auriculata f. alba (Pasq.) Z. X. Peng was sequenced using the Illumina HiSeq platform. The circular chloroplast genome has a total length of 169,357 bp and exhibits the typical quadripartite structure of angiosperms: a large single-copy (LSC) region of 92,056 bp, a small single-copy (SSC) region of 13,321 bp, and two inverted repeat (IR) regions (IRa and IRb) of 31,990 bp each. A total of 129 genes were annotated, including 84 protein-coding genes (PCGs), 37 transfer RNA (tRNA) genes, and 8 ribosomal RNA (rRNA) genes. Simple sequence repeat (SSR) analysis…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Medicinal Plant Research · Echinoderm biology and ecology
Specifications Table

| Field | Description |
| --- | --- |
| Subject | Biology |
| Specific subject area | Omics: Chloroplast Genomics |
| Type of data | Tables, Figures, Sequencing Raw Reads, Genome Assembly, Annotation. |
| Data collection | Fresh leaves collected, total genomic DNA extracted. The chloroplast genome was sequenced using the Illumina HiSeq 2500 platform (Illumina, USA). Annotation performed using CPGAVAS2 with manual curation. Genome visualized using OGDRAW. SSR analysis using MISA. Phylogenetic analysis using MAFFT and MEGA with Neighbor-Joining. |
| Data source location | City: Hangzhou City, Zhejiang Province. Country: China. Latitude and Longitude: 120°23′E, 30°08′N. Voucher Specimen: Deposited at Zhejiang Institute of Landscape Plants and Flowers Herbarium under voucher number XHD2024090403 (Curator: Qiang Chang, Email: [email protected]). |
| Data accessibility | Repository name: NCBI (National Center for Biotechnology Information). BioProject: PRJNA1303398 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1303398/). GenBank Accession Number: PX315800.1 (https://www.ncbi.nlm.nih.gov/nuccore/PX315800.1/). |
| Related research article | None. |
1. Value of the Data
- Plumbago auriculata f. alba (Pasq.) Z. X. Peng is a dual-purpose species with medicinal and ornamental importance. Its complete chloroplast genome provides a reference for distinguishing it from congeneric varieties (e.g., the blue-flowered Plumbago auriculata Lam.) at the molecular level.
- The annotated chloroplast genome and SSR markers can be used to evaluate the genetic diversity of Plumbago populations, supporting conservation and rational utilization of this medicinal resource.
- The phylogenetic data clarify the evolutionary position of P. auriculata f. alba (Pasq.) Z. X. Peng in Plumbaginaceae, laying a foundation for interspecific hybridization and ornamental trait improvement.
2. Background
Plumbago auriculata Lam. (Plumbaginaceae), commonly known as Cape leadwort, is native to South Africa and has been widely introduced in tropical and subtropical regions of China (e.g., Yunnan, Guangdong) [1,2,3]. Plumbago auriculata f. alba (Pasq.) Z. X. Peng is a white-flowered variant, distinguished from the typical blue-flowered Plumbago auriculata Lam. by petal color [4] (Fig. 1). The genus Plumbago is renowned for its medicinal properties. It contains various phytochemicals, with plumbagin being a key bioactive compound [1,5]. Plumbagin has demonstrated various pharmacological activities [6]. A comparative study of Plumbago zeylanica and Eclipta alba highlighted the antioxidant potential of Plumbago, linking it to the presence of alkaloids, saponins, tannins, and phenolic compounds [7]. Research has also investigated the antiobesity and antioxidant properties of other Plumbago species, such as Plumbago europaea and Plumbago auriculata, finding that they contain lipase inhibitors [2,3]. Another study compared the phytochemical composition and anti-fibrotic activities of Plumbago indica and Plumbago auriculata [7]. Further studies are required to fully characterize the therapeutic potential of P. auriculata f. alba (Pasq.) Z. X. Peng. Ornamentally, P. auriculata f. alba (Pasq.) Z. X. Peng is a versatile shrub prized for its pure white flowers and long blooming season. Its ease of cultivation, coupled with its ornamental and potential medicinal value, makes it a popular choice for gardens and landscapes in suitable climates [1,8].
Fig. 1. The studied Plumbago plants: (A) Plumbago auriculata f. alba (Pasq.) Z. X. Peng; (B) Plumbago auriculata Lam.
However, P. auriculata f. alba (Pasq.) Z. X. Peng and P. auriculata Lam. exhibit striking morphological similarities, differing primarily in flower color. Because the medicinal parts are the roots and rhizomes, the two are difficult to tell apart and are often confused [3]. The Flora of China contains only a single description of P. auriculata f. alba (Pasq.) Z. X. Peng, noting that this variant has white corollas [4]. As a result, identification relies on experience-based methods with low accuracy, because the distinctive features of the medicinal parts are not easily observable [9]. Moreover, conventional molecular markers lack sufficient resolution to differentiate closely related plants within the Plumbaginaceae, so reliable identification criteria are not yet available.
Chloroplast genomes are widely used in plant phylogenetics, molecular breeding, and species identification owing to their maternal inheritance, conserved structure, and moderate evolutionary rate [10]. Recently, comparative genomics based on complete chloroplast genomes has become a powerful tool for resolving difficult plant phylogenetic questions, providing high-resolution insights even in complex early-angiosperm lineages [11]. In recent years, research on P. auriculata f. alba (Pasq.) Z. X. Peng has addressed its distinctive white flowers, evergreen shrub form, and genetic variability, with an emphasis on its hybridization potential and the metabolic pathways that influence flower color [8,12,13]. Although some genetic studies exist [10,14], the genomic resources for P. auriculata f. alba (Pasq.) Z. X. Peng remain limited. This study reports the complete chloroplast genome of P. auriculata f. alba (Pasq.) Z. X. Peng, aiming to fill this gap and support future research on its genetics and utilization.
3. Data Description
3.1. Chloroplast genome structure
The complete chloroplast genome of P. auriculata f. alba (Pasq.) Z. X. Peng is a circular molecule of 169,357 bp, with a GC content of 37.12 %. The quadripartite structure includes a large single-copy (LSC) region of 92,056 bp, a small single-copy (SSC) region of 13,321 bp, and two inverted repeat (IR) regions (IRa and IRb) of 31,990 bp each (Fig. 2).
Fig. 2. Circular map of the P. auriculata f. alba (Pasq.) Z. X. Peng chloroplast genome.
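As a quick consistency check, the four region lengths reported above should add up to the stated genome size. The sketch below does that arithmetic and reports each region's share of the genome. The variable and function names are illustrative only and are not taken from the authors' pipeline.

```python
# Region lengths (bp) as reported in the text
REGION_LENGTHS_BP = {"LSC": 92_056, "SSC": 13_321, "IRa": 31_990, "IRb": 31_990}
REPORTED_TOTAL_BP = 169_357


def region_summary(regions):
    """Return the summed length and each region's percentage of that sum."""
    total = sum(regions.values())
    shares = {name: 100 * length / total for name, length in regions.items()}
    return total, shares


total_bp, shares = region_summary(REGION_LENGTHS_BP)
print(f"sum of regions: {total_bp} bp (reported: {REPORTED_TOTAL_BP} bp)")
for name, pct in shares.items():
    print(f"{name}: {pct:.2f} % of the genome")
```

The same helper can be reused for any quadripartite plastome by swapping in its region lengths.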
A total of 129 unique genes were annotated (Table 1), including 84 PCGs, 37 tRNA genes, and 8 rRNA genes. Nine genes contain one intron (atpF, ndhA, ndhB, rps16, rpoC1, rpl16, rpl2, petB, petD), and three genes (rps12, ycf3, clpP) contain two introns.

Table 1. Gene content in the chloroplast genome of P. auriculata f. alba (Pasq.) Z. X. Peng.

| Category | Gene group | Gene name |
| --- | --- | --- |
| Photosynthesis | Subunits of photosystem I | psaA, psaB, psaC, psaI, psaJ |
| | Subunits of photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbM, psbN, psbT, psbZ |
| | Subunits of NADH dehydrogenase | ndhA, ndhB**(2), ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK |
| | Subunits of cytochrome b/f complex | petA, petB, petD*, petG, petL, petN |
| | Subunits of ATP synthase | atpA, atpB, atpE, atpF*, atpH, atpI |
| | Large subunit of rubisco | rbcL |
| | Subunits of protochlorophyllide reductase | - |
| Self-replication | Proteins of large ribosomal subunit | rpl14, rpl16*, rpl2**(2), rpl20, rpl22, rpl32, rpl33, rpl36 |
| | Proteins of small ribosomal subunit | rps11, rps12**(2), rps14, rps15, rps16, rps18, rps19, rps2, rps3, rps4, rps7*(2), rps8 |
| | Subunits of RNA polymerase | rpoA, rpoB, rpoC1, rpoC2 |
| | Ribosomal RNAs | rrn16*(2), rrn23(2), rrn4.5(2), rrn5(2) |
| | Transfer RNAs | trnA-UGC**(2), trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnG-GCC, trnG-UCC*, trnH-GUG, trnI-CAU*(2), trnI-GAU**(2), trnK-UUU*, trnL-CAA*(2*), trnL-UAA*, trnL-UAG, trnM-CAU, trnN-GUU*(2), trnP-UGG, trnQ-UUG, trnR-ACG(2), trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC(2), trnV-UAC, trnW-CCA, trnY-GUA, trnfM-CAU |
| Other genes | Maturase | matK |
| | Protease | clpP |
| | Envelope membrane protein | cemA |
| | Acetyl-CoA carboxylase | accD |
| | c-type cytochrome synthesis gene | ccsA |
| | Translation initiation factor | infA |
| | Other | - |
| Genes of unknown function | Conserved hypothetical chloroplast ORF | ycf1*(2), ycf2(2)*, ycf3, ycf4 |

Notes: Gene*: gene with one intron; Gene**: gene with two introns; #Gene: pseudogene; Gene(2): number of copies of multi-copy genes.
3.2. Comparative analysis of chloroplast genomes
A total of 104 simple sequence repeat (SSR) loci were identified. Mononucleotide repeats were dominant (73 loci, 70.19 %), followed by trinucleotide repeats (13 loci, 12.50 %), dinucleotide repeats (12 loci, 11.54 %), tetranucleotide repeats (4 loci, 3.85 %), and hexanucleotide repeats (2 loci, 1.92 %). No pentanucleotide repeats were detected (Table 2).

Table 2. Statistics of different repeat types of SSRs from the chloroplast genome of P. auriculata f. alba (Pasq.) Z. X. Peng.

| Type | Counts | Length (bp) | Percent (%) | Average Length (bp) | Relative Abundance (loci/Mb) | Relative Density (bp/Mb) |
| --- | --- | --- | --- | --- | --- | --- |
| Mono | 73 | 801 | 70.19 | 10.97 | 431.04 | 4729.65 |
| Di | 12 | 134 | 11.54 | 11.17 | 70.86 | 791.23 |
| Tri | 13 | 156 | 12.50 | 12 | 76.76 | 921.13 |
| Tetra | 4 | 48 | 3.85 | 12 | 23.62 | 283.42 |
| Hexa | 2 | 36 | 1.92 | 18 | 11.81 | 212.57 |
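The derived columns of Table 2 follow from the counts, the total SSR lengths, and the genome length of 169,357 bp. The sketch below shows one way to recompute them. It assumes that "relative abundance" means loci per Mb of genome and "relative density" means SSR bp per Mb of genome, which is consistent with the tabulated values but is not stated explicitly in the text. Function and variable names are illustrative.

```python
GENOME_LENGTH_BP = 169_357

# (number of loci, total SSR length in bp) per repeat type, from Table 2
SSR_COUNTS = {
    "Mono": (73, 801),
    "Di": (12, 134),
    "Tri": (13, 156),
    "Tetra": (4, 48),
    "Hexa": (2, 36),
}


def ssr_statistics(counts, genome_bp):
    """Recompute percent, mean length, loci/Mb and bp/Mb for each repeat type."""
    total_loci = sum(n for n, _ in counts.values())
    genome_mb = genome_bp / 1_000_000
    stats = {}
    for repeat_type, (n_loci, length_bp) in counts.items():
        stats[repeat_type] = {
            "percent": 100 * n_loci / total_loci,
            "average_length": length_bp / n_loci,
            "abundance_per_mb": n_loci / genome_mb,  # assumed definition
            "density_per_mb": length_bp / genome_mb,  # assumed definition
        }
    return total_loci, stats


total_loci, ssr_stats = ssr_statistics(SSR_COUNTS, GENOME_LENGTH_BP)
for repeat_type, row in ssr_stats.items():
    print(repeat_type, {key: round(value, 2) for key, value in row.items()})
```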
In terms of codon usage (Fig. 3), the Arg codon AGA exhibited the highest relative synonymous codon usage value (RSCU = 1.80), followed by the Leu codon UUA (RSCU = 1.58). In total, 30 codons had RSCU values greater than 1, accounting for approximately 46.88 % of all codons. Notably, these preferred codons uniformly ended in adenine or uracil, indicating a pronounced bias toward A- and U-ending codons. This preference likely represents a molecular adaptation, developed through long-term evolution, that optimizes gene expression efficiency.
Fig. 3. RSCU values of all codons in the chloroplast genome of P. auriculata f. alba (Pasq.) Z. X. Peng.
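For readers unfamiliar with the metric, RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so a value above 1 marks a preferred codon. The sketch below is a minimal illustration using the standard genetic code. The codon counts are hypothetical, chosen only to make the arithmetic easy to follow, and the code does not reproduce MEGA's implementation (for example, it treats the three stop codons as one synonymous family).

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in UCAG order; "*" marks stop codons
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(codon_counts):
    """RSCU = count * (number of synonymous codons) / (total count for that amino acid)."""
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        families[amino_acid].append(codon)
    values = {}
    for codons in families.values():
        family_total = sum(codon_counts.get(c, 0) for c in codons)
        for c in codons:
            if family_total:
                values[c] = codon_counts.get(c, 0) * len(codons) / family_total
            else:
                values[c] = 0.0  # amino acid absent from the input
    return values


# Hypothetical counts for the six arginine codons (not data from this study)
toy_counts = {"AGA": 40, "AGG": 10, "CGU": 20, "CGC": 5, "CGA": 15, "CGG": 10}
toy_rscu = rscu(toy_counts)
for codon in toy_counts:
    print(codon, round(toy_rscu[codon], 2))
```

With these toy counts, AGA is used 40 times out of 100 arginine codons against an even-usage expectation of 100/6, which gives an RSCU of 2.4.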
The phylogenetic tree showed that all Plumbaginaceae species formed a monophyletic clade (bootstrap = 100 %) (Fig. 4). P. auriculata f. alba (Pasq.) Z. X. Peng clustered with P. auriculata, P. indica and P. zeylanica. The tree strongly supports P. auriculata as the closest relative of P. auriculata f. alba (Pasq.) Z. X. Peng, with both belonging to the same evolutionary lineage. Triticum aestivum was used as the outgroup. The inclusion of Lonicera japonica, Gossypium thurberi, Abelmoschus manihot, and three species of the genus Paeonia (Paeonia emodi, Paeonia suffruticosa, Paeonia ostii) provides cross-order and cross-family taxonomic sampling and places the core clades of Plumbaginaceae within a broader eudicot framework, thereby enhancing the reliability and biological significance of the analysis.
Fig. 4. Neighbor-Joining (NJ) phylogenetic tree based on complete chloroplast genome sequences. ▲ denotes the species analysed in this study.
All species consistently exhibit a quadripartite chloroplast genome structure composed of LSC-IRb-SSC-IRa, with the inverted repeat (IR) regions displaying symmetry and highly conserved genes such as rpl2 (Fig. 5). This indicates the relative structural conservation of chloroplast genomes within the genus Plumbago and is consistent with the typical characteristics of angiosperm chloroplast genomes. Differences in the length of ycf1 at the SSC-IRa boundary and rps19 at the LSC-IRb boundary reflect the expansion and contraction of the IR regions, a primary mechanism of chloroplast genome evolution. These variations may serve as molecular markers for species differentiation and phylogenetic analysis within Plumbago, providing evidence for species identification and evolutionary studies of the genus.
Fig. 5. Chloroplast genome boundary analysis of 5 species in Plumbago.
4. Experimental Design, Materials and Methods
4.1. Plant materials and DNA extraction
Fresh, young leaves of P. auriculata f. alba (Pasq.) Z. X. Peng were collected from cultivated specimens located in Hangzhou, Zhejiang Province, China (120°23′E, 30°08′N). A voucher specimen (XHD2024090403) was deposited at the Zhejiang Institute of Landscape Plants and Flowers Herbarium. Total genomic DNA was extracted from approximately 100 mg of silica-gel-dried leaf tissue using a modified CTAB protocol [15]. DNA quality and quantity were measured using a NanoDrop spectrophotometer and 1 % agarose gel electrophoresis.
4.2. Sequencing and sequence analyses
Libraries were constructed using the TruSeq DNA Sample Preparation Kit (Vanzyme, China) with transposase-mediated fragmentation (default = sonication). Briefly, 50 ng of DNA was fragmented with transposase for 30 min (default = 20 min) to target 300 bp, libraries were amplified with 12 PCR cycles (default = 15 cycles) to introduce index tags, and fragments of 200–400 bp were selected with Agencourt SPRIselect Beads (0.6× + 1.8× ratios). Libraries were validated with a Qubit 4 (≥2 nM) and an Agilent 2100 Bioanalyzer (peak ≈ 300 bp). Sequencing was performed on the Illumina HiSeq 2500 platform (2 × 150 bp paired-end), yielding 36,190,878 raw reads (5.37 Gb total bases). Raw reads were quality-checked with FastQC v0.11.8 [16] and filtered with fastp v0.19.5 using non-default parameters (removing reads with ≥3 N bases, 3′ ends with Q<20, <60 % of bases with Q≥20, or length <60 bp), resulting in 35,018,194 high-quality reads (Q20 = 96.0 %, Q30 = 91.1 %, GC = 47.0 %). Chloroplast read purity was confirmed with Kraken2 v2.0.9. Chloroplast genome assembly was conducted with MetaSPAdes v3.13.0 (k-mer = 21/33/55/77, minimum contig length = 500 bp), yielding 869 contigs (total length 976,915 bp, N50 = 1070 bp). Reference-guided correction with Blast+ v2.9.0 (-evalue 1e-10, -ungapped) against the P. auriculata chloroplast genome (GenBank: MH286308.1) was then used to correct contig orientation, fill gaps, and verify circularization, resulting in a single circular genome (169,357 bp, GC = 37.12 %, Ns = 0.00 %). The genome was annotated with CPGAVAS2 [17], with manual correction using Blast+ v2.9.0, yielding 84 total genes (82 complete, 1 low-similarity ndhD, 0 missing), and a circular gene map was generated with OGDRAW v1.3.1 [18]. Post-assembly validation included coverage depth analysis with BWA v0.7.17 (all positions ≥30× coverage) and collinearity analysis with Circos v0.69-6 (100 % collinearity with the reference genome, no rearrangements). All tools, versions, and non-default parameters are provided for reproducibility.
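The read-filtering criteria listed above can be expressed as a simple per-read test. The sketch below is a plain-Python illustration of those thresholds, not a reimplementation of fastp. It assumes that "3′ ends with Q<20" means trimming low-quality bases from the 3′ end before the other checks, and that the order of the checks does not matter. Function names and the toy reads are hypothetical.

```python
def trim_3prime(seq, quals, min_q=20):
    """Drop trailing bases whose Phred quality is below min_q (assumed interpretation)."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def passes_filter(seq, quals, max_n=2, min_q=20, min_good_fraction=0.60, min_len=60):
    """Apply the thresholds quoted in the text to one read.

    A read is rejected if, after 3' trimming, it is shorter than 60 bp,
    contains 3 or more N bases, or has fewer than 60 % of bases with Q >= 20.
    """
    seq, quals = trim_3prime(seq, quals, min_q)
    if len(seq) < min_len:
        return False
    if seq.upper().count("N") > max_n:
        return False
    good_bases = sum(q >= min_q for q in quals)
    return good_bases / len(seq) >= min_good_fraction


# Toy reads with Phred scores given as integers (illustrative only)
clean_read = ("A" * 100, [30] * 100)
n_rich_read = ("NNN" + "A" * 97, [30] * 100)
low_tail_read = ("A" * 100, [30] * 50 + [10] * 50)
noisy_read = ("A" * 100, [10, 30] * 50)

for label, (seq, quals) in {
    "clean": clean_read,
    "n_rich": n_rich_read,
    "low_tail": low_tail_read,
    "noisy": noisy_read,
}.items():
    print(label, passes_filter(seq, quals))
```

Only the clean read passes: the second has three N bases, the third is trimmed to 50 bp, and in the fourth only half of the bases reach Q20.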
4.3. SSRs and RSCU analysis
Simple sequence repeats (SSRs) were identified using the MISA web tool [19] with threshold parameters set to 10, 5, 4, 3, 3, and 3 for mono- to hexa-nucleotides, respectively. MEGA 6.0 was used to examine the relative synonymous codon usage (RSCU) values, base composition and codon usage [20].
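To make the thresholds concrete: a mononucleotide run counts as an SSR once the unit is repeated at least 10 times, a dinucleotide at least 5 times, a trinucleotide at least 4 times, and tetra- to hexanucleotides at least 3 times. The sketch below applies those minimum repeat numbers with regular expressions. It is a simplified illustration, not the MISA algorithm: it ignores compound SSRs, and the rule that skips motifs made of a shorter repeated unit (such as "AA" or "ATAT") is an assumption about how such cases should be handled. The test sequence is synthetic.

```python
import re

# Minimum number of repeats per motif length, as stated in the text
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, repeat_count) tuples with 1-based inclusive coordinates."""
    seq = seq.upper()
    hits = []
    for k, min_rep in min_repeats.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        pos = 0
        while True:
            match = pattern.search(seq, pos)
            if match is None:
                break
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start() + 1, match.end(), motif, len(match.group(0)) // k))
                pos = match.end()
            else:
                # e.g. "AA" found by the dinucleotide scan: step forward by one base
                pos = match.start() + 1
    return sorted(hits)


# Synthetic sequence containing one mono-, one di- and one trinucleotide SSR
toy_seq = "CCGTC" + "A" * 10 + "CGC" + "AT" * 5 + "GCC" + "AAG" * 4 + "CTC"
toy_hits = find_ssrs(toy_seq)
for hit in toy_hits:
    print(hit)
```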
4.4. Phylogenetic analysis
For phylogenetic analysis, complete chloroplast genome sequences of 20 related species were obtained from NCBI. Triticum aestivum (PP829256.1) was used as the outgroup. The sequences were aligned using MAFFT [21], and a Neighbor-Joining (NJ) phylogenetic tree was constructed using MEGA 6.0 [20] with 1000 bootstrap replicates.
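The Neighbor-Joining step can be illustrated on a small distance matrix. The sketch below implements the standard NJ agglomeration (Q-matrix minimisation, branch-length assignment, and distance update) in plain Python. It is a teaching example only: the five-taxon matrix is a generic toy example with arbitrary labels, not distances from this study, and the code covers neither MEGA's substitution models nor the bootstrap resampling used by the authors.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Run Neighbor-Joining on a symmetric distance table.

    names: list of taxon labels.
    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    Returns (joins, final_edge), where each join is
    (child_a, branch_a, child_b, branch_b, new_node) and final_edge
    connects the last two remaining nodes.
    """
    d = dict(dist)
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others
        r = {i: sum(d[frozenset((i, j))] for j in nodes if j != i) for i in nodes}
        # Pick the pair minimising Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j)
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        d_ab = d[frozenset((a, b))]
        branch_a = 0.5 * d_ab + (r[a] - r[b]) / (2 * (n - 2))
        branch_b = d_ab - branch_a
        new_node = f"node{len(joins)}"
        for k in nodes:
            if k not in (a, b):
                d[frozenset((new_node, k))] = 0.5 * (
                    d[frozenset((a, k))] + d[frozenset((b, k))] - d_ab
                )
        nodes = [k for k in nodes if k not in (a, b)] + [new_node]
        joins.append((a, branch_a, b, branch_b, new_node))
    last_a, last_b = nodes
    final_edge = (last_a, last_b, d[frozenset((last_a, last_b))])
    return joins, final_edge


# Toy distances between five hypothetical taxa (not data from this study)
toy_names = ["a", "b", "c", "d", "e"]
toy_matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy_dist = {
    frozenset((toy_names[i], toy_names[j])): toy_matrix[i][j]
    for i, j in combinations(range(len(toy_names)), 2)
}
toy_joins, toy_final_edge = neighbor_joining(toy_names, toy_dist)
for join in toy_joins:
    print(join)
print(toy_final_edge)
```

In a bootstrap analysis, the alignment columns would be resampled with replacement, distances recomputed, and this procedure repeated for each replicate. That part is omitted here.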
Limitations
‘None’.
Ethics Statement
All authors have read and followed the ethical requirements for publication in Data in Brief and confirm that the current work does not involve human subjects, animal experiments, or any data collected from social media platforms.
Credit Author Statement
Jiaqiang Zhang: Data curation, Writing-original draft. Danbin Xu: Investigation, Methodology. Songlin Wang: Project administration. Shizhen Wang: Resources, Software, Supervision. Yuan Zhou: Validation, Visualization. Bowei He: Supervision, Writing-review & editing. Juan Zhang: Conceptualization, Funding acquisition, Writing-review & editing.
References
- [1] Singh K., Naidoo Y., Baijnath H. A comprehensive review on the genus Plumbago with focus on Plumbago auriculata (Plumbaginaceae). Afr. J. Tradit. Complement Altern. Med. 15 (2018) 199–215. doi:10.21010/ajtcam.v15i1.21
- [2] Pandey D.K., Katoch K., Das T., Majumder M., Dhama K., Mane A.B., Gopalakrishnan A.V., Dey A. Approaches for in vitro propagation and production of plumbagin in Plumbago spp. Appl. Microbiol. Biot. 107 (2023) 4119–4132. doi:10.1007/s00253-023-12511-6. PMID: 37199750
- [3] Intharuksa A., Phrutivorapongkul A., Thongkhao K. Integrating DNA barcoding, microscopic, and chemical analyses for precise identification of Plumbago indica L. A prominent medicinal plant. Microchem J 199 (2024) 110038. doi:10.1016/j.microc.2024.110038
- [4] Editorial Committee of Flora Reipublicae Popularis Sinicae, Chinese Academy of Sciences (Eds.). Flora Reipublicae Popularis Sinicae (FRPS) 60 (1987) 7, Part 1, Beijing.
- [5] Saxena A., Gautam S., Arya K.R., Singh R.K. Comparative study of phytochemicals, antioxidative potential & activity of enzymatic antioxidants of Eclipta alba and Plumbago zeylanica by in vitro assays. Free Radicals. Antioxid. 6 (2016) 139–144. doi:10.5530/fra.2016.2.2
- [6] Thekkumkara S., Longchar A., Venkidasamy B., Kondapavuluri B.K., Thiruvengadam M., Ghorbanpour M., Sankaran S. Different derivatives of plumbagin analogue: bioavailability and their toxicity studies. Food Sci. Nutr. 13 (2025) e70720. doi:10.1002/fsn3.70720. PMCID: PMC12365397. PMID: 40842665
- [7] Selim N.M., Melk M.M., Melek F.R., Saleh D.O., Sobeh M., El-Hawary S.S. Phytochemical profiling and anti-fibrotic activities of Plumbago indica L. and Plumbago auriculata Lam. in thioacetamide-induced liver fibrosis in rats. Sci Rep 12 (2022) 9864. doi:10.1038/s41598-022-13718-9. PMID: 35701526. PMCID: PMC9197831
- [8] Chen X., Gao S., Shen P., Liu Y., Lei T., Shi L., Li W., Li Y., Yu X., Yang L., Li J. Genetic diversity analysis of intraspecific hybridization between Plumbago auriculata and Plumbago auriculata f. alba based on horticultural traits and molecular markers. Acta Physiol. Plant. 43 (2021) 31. doi:10.1007/s11738-020-03188-9
