Assessment of Genetic Diversity and Population Structure in Oil-Bearing Rose Genotypes Using Start Codon-Targeted (SCoT) Markers
Mariya Zhelyazkova, Veselina Badzhelova, Florentina Barbu, Stela Lazarova, Peter Hristov

TL;DR
This study uses SCoT markers to assess genetic diversity in oil-bearing roses, revealing differences among cultivars and highlighting the value of local and traditional rose varieties.
Contribution
The study applies SCoT markers to Bulgarian and Romanian oil-bearing rose germplasm and shows that they are an effective tool for analyzing genetic diversity in Rosa damascena and related species.
Findings
SCoT markers revealed significant genetic diversity within and among oil-bearing rose accessions.
The locally improved R. damascena 'Population 5' showed higher genetic diversity than traditional cultivars.
The unidentified Rosa sp. clustered closely with R. gallica, suggesting taxonomic or genetic relationships.
Abstract
The oil-bearing rose (Rosa damascena Mill.), traditionally cultivated in Bulgaria for centuries, and the rose oil produced from it are of major cultural and economic importance. Its distinctive fragrance and rich aromatic profile are highly valued worldwide. In this study, a set of 15 start codon-targeted (SCoT) molecular markers was used to evaluate the genetic diversity and relationships of 38 rose accessions. The analyzed materials included Bulgarian-bred R. damascena cultivars, a locally improved population (‘Population 5’), three oil-bearing species (Rosa alba L., Rosa gallica L., and Rosa centifolia L.), Romanian heritage roses, and an unidentified rose genotype from an old Bulgarian plantation (Rosa sp.). The SCoT primers yielded a cumulative count of 238 bands, with an average of 12.9 bands per primer. The range of diversity markers, such as PIC (0.20–0.78), number of different…
Funding
- National Science Fund
- Bulgarian Ministry of Education and Science
Taxonomy
Topics: Plant biochemistry and biosynthesis · Powdery Mildew Fungal Diseases · Genetic diversity and population structure
1. Introduction
Studies on the genetic diversity and population structure of Rosa species, including cultivated oil-bearing roses, are essential for selection, conservation planning, and the establishment of representative core collections [1,2,3,4]. The identification and preservation of unique genotypes with valuable agronomic or aromatic traits are increasingly important because of climate change, increasing pathogen pressure, and the growing global demand for natural aromatic products [3,5]. Studies aimed at genetic characterization also help to clarify taxonomic uncertainties within the genus, which is well known for its complex evolutionary history, hybridization events, and widespread polyploidy [1,2,3,4,5,6,7]. Moreover, the molecular identification of unknown rose genotypes is becoming particularly important as new or previously neglected populations are discovered, especially in regions with rich wild rose diversity [3,4,5,8,9,10,11].
Among the oil-bearing roses, Rosa damascena Mill. is the most widely cultivated hybrid worldwide, followed by Rosa gallica L., Rosa centifolia L., and Rosa alba L. [12]. Rosa damascena is also the most economically significant aromatic and oil-bearing rose, valued for its essential oil, a key ingredient in high-end perfumery, cosmetics, and pharmaceutical products. It has a long cultivation history and significant cultural and symbolic value, reflecting the long-standing relationship between humans and the natural environment [13]. Traditionally cultivated in regions such as Bulgaria, Turkey, Iran, and Morocco, R. damascena represents a unique genetic resource shaped by centuries of vegetative propagation, cultural practices, and local environmental conditions [14]. Despite its long history of cultivation, the genetic base of R. damascena has often been described as narrow, largely due to predominantly clonal propagation in industrial oil rose plantations [15]. However, advances in molecular genetics over the past two decades have challenged this assumption by revealing previously unrecognized levels of diversity within and among regional populations [8,9,10,11,16,17,18,19].
Native to Europe and parts of western Asia, R. gallica was among the first rose species domesticated in Europe and is considered a progenitor of many modern cultivated roses [20]. Also known as French or apothecary’s rose, it has been typically grown for both essential oil production and traditional herbal medicine. Its cultivation is more limited today, usually occurring in small-scale plantations, experimental and research collections, and botanical and historical gardens across parts of Europe, including France, Bulgaria, and other temperate regions [21]. The industrial cultivation of R. centifolia is more limited compared with R. damascena. It is a complex hybrid; however, its precise parental lineage is not fully resolved. Molecular phylogenetic analyses across the genus Rosa have shown that cultivated roses, including old garden types like R. centifolia, derive from a mosaic of wild species and multiple hybridization events. Gudin [22] suggests contributions from species such as (R. gallica × Rosa moschata Herrm.) × R. alba, whereas Vukosavljev [23] indicates that it is derived from R. gallica and R. alba or Damask roses. Both R. gallica and R. centifolia have also been the subject of recent scientific studies related to their essential oil and hydrosol compositions, underpinning their value as aromatic raw materials and supporting their continued cultivation for niche products and regional aromatic industries [24,25]. Rosa alba is also a hybrid of Rosa canina L. × R. gallica [22] and is mainly grown for aromatic water and local products in parts of North Africa and Europe, including Bulgaria [26].
Of all the roses discussed above, R. damascena has also been the most intensively studied species, particularly with regard to its genetic diversity. Early studies used Amplified Fragment Length Polymorphism (AFLP), Inter-Simple Sequence Repeat (ISSR), and microsatellite (SSR) markers, which proved effective for distinguishing between closely related genotypes and assessing population structure [8,16,17,19]. ISSR and start codon-targeted (SCoT) markers have been applied to reveal polymorphism even among accessions previously assumed to be genetically uniform [2,18,19]. More recent advances in SNP-based approaches, including high-throughput sequencing, genotyping-by-sequencing (GBS), and genome-wide analyses, have significantly improved the resolution at which genetic variation can be assessed. These methods enable the detection of fine-scale genetic differences, hidden population structure, as well as applications in quantitative trait loci (QTL) mapping and marker-assisted selection [1,14,27,28]. Integration of molecular and chemical analyses has further demonstrated that genetic variability is often associated with distinct chemotypes and essential oil compositions, suggesting potential for targeted selection and regional product differentiation [18,29].
Another major challenge in rose research is the reliable identification of unknown or mislabelled genotypes. Previous studies have reported the presence of unclassified or previously undescribed genotypes of R. damascena with distinct genetic profiles [3,4,8]. These findings highlight the importance of comprehensive and systematic genetic characterization, particularly in regions where traditional cultivation practices or introgression between wild and cultivated populations may have generated genetically divergent lineages [3]. Accurate genotypic classification is also crucial for germplasm conservation, selection programs, and the establishment of authentic geographic indications—an increasingly important issue for countries that rely on rose oil production as a strategic agricultural sector [14,30]. Overall, growing molecular evidence indicates that the genetic diversity of oil-bearing roses is greater and more structured than previously assumed. A comprehensive assessment of this diversity, particularly through the genetic identification of unknown accessions, is therefore essential to enhance conservation efforts, improve selection programs, and fully explore the aromatic and agronomic potential of R. damascena.
Among essential oil crops, the rose occupies a leading position and represents one of the most important industrial crops in Bulgaria. Rose oil and rose-derived products are among the most recognizable and emblematic Bulgarian commodities on global markets. Bulgaria has more than 350 years of tradition in the cultivation of the oil-bearing rose, with the earliest documented evidence of rose gardens dating back to 1712 [31]. Rosa damascena, also known as the Kazanlik oil-bearing rose in Bulgaria, has been the focus of research at the Institute of Roses, Essential and Medical Cultures (IREMC), Kazanlak, for more than 100 years. As a result of many years of selection efforts, a valuable genetic fund has been created, part of which has been preserved to this day, forming the basis of modern Bulgarian rose production.
In this study, we used SCoT molecular markers to (i) evaluate the genetic diversity of 38 Rosa accessions, (ii) clarify the genetic relationships among established R. damascena lineages and related species, and (iii) determine the genetic position of an unclassified Bulgarian rose genotype and three old rose plants from Romania.
SCoT markers have been directly linked to gene function and effectively employed for genotyping and polymorphism assessment [32]; they have also been successfully applied in genetic diversity analyses and diagnostic fingerprinting across a wide range of essential oil crops [2,18,33,34,35,36,37].
2. Materials and Methods
2.1. Plant Material
The plant material was collected in May 2025 and included 35 samples of oil-bearing rose species and cultivars from Bulgaria and three rose accessions from Romania. All Bulgarian plant samples are from the experimental collection of the Institute of Roses, Essential and Medical Cultures (IREMC), Kazanlak. During a field survey conducted in 2023, an old private plantation (over 50 years old) with an unknown oil-bearing rose (Rosa sp.) was discovered near the town of Klisura, Bulgaria. Eight specimens were collected, transferred to the IREMC, and clonally propagated for further investigation. Plant material from each specimen was included in the present study (designated Kl1–Kl8). The remaining Bulgarian material comprised nine samples (P1–P9) from an improved selection of local populations of Rosa × damascena nothof. × trigintipetala (Dieck) R.Keller (‘Population 5’); four Bulgarian cultivars, namely ‘Iskra’ (I1–I2), ‘Yanina’ (Y1–Y2), ‘Eleina’ (E1–E2), and ‘Svezhen’ (SV1–SV2); three clones from local populations of R. × alba (A1–A3); two specimens (R1–R2) of the Russian cultivar ‘Raduga’, a complex hybrid between a variety of R. gallica and R. damascena; three accessions of R. gallica (G1–G3); and two R. centifolia L. plants (C1–C2).
The Romanian samples comprise two accessions (G4 and D) provisionally identified as R. gallica, one from Comuna Poeni, Teleorman (44°24′ N, 25°20′ E) and the other from Carbunești, Prahova (45°13′14″ N, 26°12′30″ E), and one accession of R. alba (A4) collected in Carbunești, Prahova (45°13′14″ N, 26°12′30″ E), introduced to the Faculty of Horticulture, University of Agronomic Sciences and Veterinary Medicine of Bucharest, for further investigation, and provided to us for this study. All of these samples originate from old rose plants cultivated in private gardens and used for food products (e.g., jam and herbal tea preparation).
Detailed characteristics of the sampled plants are presented in Table 1. Representative specimens differing in morphology, species, or cultivar are shown in Figure 1.
Descriptions of Rosa accessions: A1–A3 and G1–G3 were provided by V.B. (co-author); G4, D, and A4 by F.B. (co-author); P1–P9 [38] and R1–R2, C1–C2, I1–I2, Y1–Y2, E1–E2, and SV1–SV2 [39] were taken from previously published sources.
2.2. Genomic DNA Isolation
Frozen plant tissue from young leaves was homogenized using ZR BashingBead™ Lysis Tubes (Zymo Research Corp., Irvine, CA, USA) with Lysis Buffer in combination with a Disruptor Genie homogenizer (Scientific Industries, Inc., Bohemia, NY, USA). Genomic DNA was subsequently extracted following the manufacturer’s protocol for the GeneMATRIX Plant & Fungi DNA Purification Kit (EURx Ltd., Gdansk, Poland). All DNA samples were adjusted to a working concentration of 20 ng/µL. DNA purity was assessed using a NanoVue Plus spectrophotometer (GE Healthcare UK Limited, Amersham Place, Little Chalfont, Buckinghamshire, UK), and only samples with an A260/280 ratio between 1.8 and 2.0 were used for downstream analyses. The extracted DNA was stored at −20 °C prior to analysis.
2.3. PCR Amplification and Visualization
A total of 30 SCoT primers designed by Invitrogen (Invitrogen, Darmstadt, Germany) and selected from previous studies of roses [2,40] were initially tested. Fifteen primers that produced clear, reproducible, and polymorphic banding patterns were ultimately used for amplification across all Rosa accessions (Table 2). PCR amplifications were performed in a final volume of 20 µL containing 1 µL genomic DNA (20 ng), 10 µL Red Taq DNA Polymerase 2× Master Mix (1.5 mM MgCl₂), 1 µL primer (10 pmol), and 8 µL ddH₂O. All PCR reactions were carried out using a Doppio Gradient 2 × 48-well thermal cycler (VWR®, Darmstadt, Germany) under the following conditions: initial denaturation at 94 °C for 5 min; 35 cycles of denaturation at 94 °C for 45 s, primer annealing at 50–58 °C for 45 s, and extension at 72 °C for 90 s; followed by a final extension at 72 °C for 10 min.
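As a quick consistency check on the reaction setup described above, the following minimal sketch sums the component volumes, derives the final primer concentration, and estimates the programmed duration of the cycling profile. The variable names are illustrative, and ramp times between temperature steps are ignored (an assumption), so an actual run would take somewhat longer.

```python
# Per-reaction component volumes (µL) as stated in the text.
volumes_ul = {
    "genomic DNA (20 ng/µL)": 1.0,
    "2x Red Taq Master Mix": 10.0,
    "primer (10 pmol)": 1.0,
    "ddH2O": 8.0,
}
total_ul = sum(volumes_ul.values())

# 10 pmol of primer in the final volume; pmol/µL is numerically equal to µM.
primer_um = 10.0 / total_ul

# Programmed hold times only; ramping between temperatures is ignored (assumption).
initial_s = 5 * 60
cycle_s = 45 + 45 + 90          # denaturation + annealing + extension
final_s = 10 * 60
run_min = (initial_s + 35 * cycle_s + final_s) / 60

print(f"total volume: {total_ul} µL")
print(f"final primer concentration: {primer_um} µM")
print(f"programmed cycling time: {run_min} min")
```

Under these assumptions the components add up to the stated 20 µL, the primer is at 0.5 µM, and the programmed holds amount to 120 min.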
PCR products were separated on 1.7% agarose gels prepared in 1× TBE buffer, stained with GelRed® (Biotium, Fremont, CA, USA), and visualized under UV light using a UV transilluminator (Bio-Imaging System, Modi’in-Maccabim-Re’ut, Israel). Fragment sizes of SCoT-PCR products were estimated using the NZYDNA Ladder VI (NZYtech Lda., Lisbon, Portugal), ranging from 50 to 1500 bp.
2.4. Data Analysis
SCoT-amplified fragments were scored as a binary data matrix, recording the presence (1) or absence (0) of each band. The discriminatory capacity of the primers was evaluated using four indices. Polymorphic Information Content (PIC) was calculated as PICi = 2fi(1 − fi), where i is the locus, fi is the frequency of the amplified fragments, and (1 − fi) is the frequency of the non-amplified fragments [41]. The effective multiplex ratio (EMR) was calculated as EMR = n × β, where n is the mean number of fragments amplified per primer and β = PB/(PB + MB), with PB the number of polymorphic fragments and MB the number of monomorphic fragments [42]. The marker index (MI) was calculated as MI = EMR × PIC [43]. Resolving power (RP) was calculated as RP = Σ Ib, where Ib = 1 − (2 × |0.5 − pi|) and pi is the proportion of genotypes in which the fragment is present [44].
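The four indices can be illustrated with a minimal sketch that applies the formulas above to the binary band matrix of a single primer. The function name and the example matrix are hypothetical; averaging PIC over all bands of the primer and taking n as that primer's own band count are assumptions about how the per-primer values are obtained.

```python
def primer_indices(bands):
    """bands: list of band rows for one primer; each row holds 0/1 calls across genotypes."""
    n_genotypes = len(bands[0])
    # f_i: proportion of genotypes in which band i is present
    freqs = [sum(row) / n_genotypes for row in bands]
    # PIC per band is 2f(1 - f); the primer value is taken as the mean over bands (assumption)
    pic = sum(2 * f * (1 - f) for f in freqs) / len(freqs)
    # A band is polymorphic if it is neither absent from all genotypes nor present in all
    polymorphic = sum(1 for f in freqs if 0 < f < 1)
    beta = polymorphic / len(bands)          # PB / (PB + MB)
    emr = len(bands) * beta                  # n taken as this primer's band count (assumption)
    mi = emr * pic
    rp = sum(1 - 2 * abs(0.5 - f) for f in freqs)
    return {"PIC": pic, "EMR": emr, "MI": mi, "RP": rp}

# Hypothetical primer: four bands scored in four genotypes.
example = [
    [1, 1, 1, 1],   # monomorphic
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
]
indices = primer_indices(example)
print(indices)
```

For this toy matrix the sketch gives PIC = 0.3125, EMR = 3.0, MI = 0.9375, and RP = 2.0. With this form of PIC, a single band cannot exceed 0.5.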
Genetic diversity per locus/primer, represented by Nei’s gene diversity [45] and Shannon’s information index (I), was estimated using PopGen 1.32.
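For orientation, the two per-locus statistics can be sketched as follows. This is not the software's own code: the sketch assumes a two-allele locus, optionally estimates the null-allele frequency from the band-absent proportion under Hardy-Weinberg equilibrium (a common treatment of dominant markers), and omits any sample-size correction. The function name and flag are hypothetical.

```python
import math

def diversity_from_band(presence, diploid_dominant=True):
    """presence: 0/1 calls for one band across individuals.
    Returns (Nei's gene diversity h, Shannon's information index I)."""
    absent = presence.count(0) / len(presence)
    if diploid_dominant:
        # Band-absent individuals are taken as null homozygotes, so q^2 = absent (HWE assumption)
        q = math.sqrt(absent)
    else:
        # Haploid-style treatment: allele frequency equals phenotype frequency
        q = absent
    p = 1.0 - q
    h = 1.0 - p * p - q * q
    shannon = -sum(x * math.log(x) for x in (p, q) if x > 0)
    return h, shannon

h_poly, i_poly = diversity_from_band([1, 1, 1, 0])
h_mono, i_mono = diversity_from_band([1, 1, 1, 1])
print(h_poly, i_poly)
print(h_mono, i_mono)
```

In the first example p = q = 0.5, which yields the theoretical maxima for a two-allele locus, h = 0.5 and I = ln 2 ≈ 0.69; a monomorphic band yields zero for both.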
GenAlEx 6.5 [46] was employed to assess population-level genetic diversity and to perform Principal Coordinate Analysis (PCoA) and Analysis of Molecular Variance (AMOVA). Genetic differentiation among populations was estimated using ΦPT (PhiPT), an FST analogue for dominant markers, via AMOVA. The first three principal coordinate axes from GenAlEx (PC1 vs. PC2, PC2 vs. PC3) were visualized in GraphPad Prism 10.6.1. STRUCTURE (v 2.3.4) was used to analyze population structure with a Bayesian model that estimates individual membership coefficients (Q values) and to assess gene flow [47,48]. The number of clusters (K) was explored across a range from 2 to 11, with 20 independent runs per K value, using a burn-in of 100,000 iterations followed by 10,000 Markov Chain Monte Carlo (MCMC) iterations after burn-in. The optimal K was taken as the value with the highest ΔK, calculated in StructureSelector (https://lmme.ac.cn/StructureSelector/, accessed on 7 October 2025) [49].
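The ΔK criterion can be sketched as below. This is an illustrative reimplementation, not StructureSelector's code: it uses the common form in which the absolute second-order difference of the mean log-likelihoods is divided by the standard deviation of L(K) across replicate runs. The log-likelihood values are invented for the example and use three runs per K instead of twenty.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """lnp_by_k: dict mapping K to a list of ln P(D) values from replicate runs.
    Returns a dict mapping K to Delta K for every K that has both neighbours."""
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in lnp_by_k and k + 1 in lnp_by_k:
            second_diff = abs(
                mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
            )
            sd = stdev(lnp_by_k[k])
            result[k] = second_diff / sd if sd > 0 else float("inf")
    return result

# Hypothetical ln P(D) values, three replicate runs per K.
runs = {
    2: [-3000, -3010, -2990],
    3: [-2800, -2805, -2795],
    4: [-2650, -2655, -2645],
    5: [-2400, -2402, -2398],
    6: [-2390, -2400, -2380],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because ΔK needs both K − 1 and K + 1, it is undefined at the ends of the tested range; for a range of 2 to 11 it can only be evaluated for K = 3 to 10.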
A dendrogram based on the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) was constructed in MEGA 12 [50] using Nei’s genetic distance matrix generated in PopGen 1.32. The resulting cluster tree was graphically refined using iTOL [51].
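To make the clustering step concrete, here is a minimal pure-Python sketch of UPGMA on a small distance matrix. It is not MEGA's implementation; the function name, the four labels, and the distances are hypothetical and are not taken from the study's data.

```python
def upgma(labels, dist):
    """labels: list of names; dist: square list-of-lists of pairwise distances.
    Returns (nested-tuple tree, list of (left, right, node_height) merges)."""
    clusters = {i: (labels[i], 1) for i in range(len(labels))}   # id -> (subtree, size)
    d = {(i, j): dist[i][j]
         for i in range(len(labels)) for j in range(i + 1, len(labels))}
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        # Join the closest pair of active clusters
        (a, b), d_ab = min(d.items(), key=lambda kv: kv[1])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        # Distance from the new cluster to each remaining one: size-weighted average
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        d = {key: val for key, val in d.items() if a not in key and b not in key}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        merges.append((tree_a, tree_b, d_ab / 2))   # node height is half the joining distance
        next_id += 1
    return next(iter(clusters.values()))[0], merges

# Hypothetical Nei-type distances among four accessions.
names = ["P1", "E1", "G1", "A1"]
matrix = [
    [0.00, 0.10, 0.30, 0.40],
    [0.10, 0.00, 0.30, 0.40],
    [0.30, 0.30, 0.00, 0.36],
    [0.40, 0.40, 0.36, 0.00],
]
tree, merges = upgma(names, matrix)
print(tree)
```

In this toy example P1 and E1 join first, G1 joins that pair next, and A1 joins last, giving the nested tree `('A1', ('G1', ('P1', 'E1')))`.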
3. Results
3.1. SCoT-PCR Amplification Results
Fifteen SCoT markers amplified a total of 238 bands, of which 194 were polymorphic and 44 were monomorphic, resulting in an average polymorphism of 89%. Two primers, SCoT 6 and SCoT 21, exhibited 100% polymorphism (Table 3). Across all primers, the number of polymorphic bands ranged from 2 to 19, with an average of 12.9 per primer. The maximum number of polymorphic bands was produced by primer SCoT 3 (19 bands), followed by SCoT 21 (18 bands), SCoT 11 (17 bands), and SCoT 13 (17 bands). Of the 15 SCoT primers, SCoT 6 yielded the fewest bands (2 bands). Furthermore, the PIC value ranged from 0.20 for SCoT 36 to 0.78 for SCoT 6, with an average PIC value of 0.52 for all tested primers, confirming their high efficiency in detecting polymorphism among the studied genotypes. Additionally, the mean values for Effective Multiplex Ratio (EMR), Resolving Power (RP), and Marker Index (MI) were 18.5 (ranging between 11.1 and 24.4), 19.2 (ranging between 1.9 and 26.5), and 9.4 (ranging between 3.6 and 11.9), respectively, highlighting the strong discriminatory capacity of the selected SCoT markers. A representative DNA fingerprinting pattern generated with primer SCoT 25 is shown in Figure 2, while the patterns obtained with all primers are shown in Supplementary Figure S1.
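The band counts reported above can be cross-checked with simple arithmetic. In the sketch below, only the three counts and the number of primers come from the text. The reported mean of 12.9 corresponds to polymorphic bands per primer (194/15). Reading the reported 89% as the mean of per-primer percentages, rather than the pooled proportion, is an assumption, since Table 3 is not reproduced here.

```python
total_bands = 238
polymorphic_bands = 194
monomorphic_bands = 44
n_primers = 15

# The two categories should account for every scored band.
counts_consistent = (polymorphic_bands + monomorphic_bands == total_bands)

mean_polymorphic_per_primer = polymorphic_bands / n_primers
mean_total_per_primer = total_bands / n_primers
pooled_polymorphism_pct = 100 * polymorphic_bands / total_bands

print(counts_consistent)
print(round(mean_polymorphic_per_primer, 1))   # polymorphic bands per primer
print(round(mean_total_per_primer, 1))         # all bands per primer
print(round(pooled_polymorphism_pct, 1))       # pooled share of polymorphic bands
```

The pooled share of polymorphic bands (about 81.5%) is lower than the mean of per-primer percentages would be if primers with few bands, such as SCoT 6, are fully polymorphic, because an unweighted mean gives such primers the same weight as primers with many bands.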
The maximum number of identified alleles (Na) of 2.00 was observed with SCoT 6 and SCoT 21 primers, followed by 1.95 for SCoT 3 and 1.94 for SCoT 13, while the minimum Na of 1.54 was recorded with SCoT 36 (Table 4). The average Na of 1.81 was recorded for all SCoT primers used. Moreover, the maximum and minimum Shannon’s information index (I) of 0.69 and 0.24 were recorded for SCoT 6 and SCoT 36, respectively, and the average I for all SCoT primers was 0.40. The maximum gene diversity (H) of 0.50 was detected with SCoT 6, while the minimum was 0.15 with SCoT 36, and a mean gene diversity value of 0.26 was observed for all SCoT tested markers.
3.2. Genetic Diversity and Differentiation Among Rosa Accessions
Genetic diversity indices were calculated for all accessions, treating the unidentified Rosa sp. as a separate taxon (Table 5). Genetic diversity per population varied considerably. The highest number of alleles (Na) was observed in the improved Bulgarian population ‘Population 5’, P1–P9 (1.38), followed by the five accessions of R. gallica, G1–G4 and D (1.24), and the local and Romanian populations of R. alba, A1–A4 (1.06). In contrast, R. centifolia, C1–C2, displayed the lowest Na value (0.69). A similar pattern was observed for the effective number of alleles (Ne) and Shannon’s information index (I): both were highest in R. damascena ‘Population 5’, P1–P9 (1.35 and 0.29, respectively), and lowest in the Bulgarian cultivars of R. damascena (1.06 and 0.05) (Table 5). In addition, the highest expected heterozygosity (He) was detected in ‘Population 5’, P1–P9 (0.20); the value was half as high in Rosa sp., Kl1–Kl8 (0.10), and lowest in the cultivar ‘Svezhen’, SV1–SV2 (0.03).
The highest number of loci (TB = 206), including five SCoT-specific loci, was detected in R. damascena ‘Population 5’, confirming the strong genetic potential of this improved Bulgarian population. Among the remaining species, the total number of bands ranged from 143 in R. centifolia to 188 in R. gallica. A considerable number of species-specific loci were also identified in R. alba (6), followed by R. gallica (4) and R. centifolia (4), whereas only a single species-specific locus was detected in Rosa sp. Across the 15 SCoT markers, only two cultivar-specific loci were identified—one in ‘Raduga’ and one in ‘Yanina’. However, twenty genotype-specific loci were identified that distinguish the cultivars from one another and are also detectable in the genotypes of R. damascena ‘Population 5’ and R. alba (Supplementary Table S1). The percentage of polymorphic loci among the species ranged from 8.51% to 50.64%, with the highest value again recorded for ‘Population 5’, followed by R. gallica, R. alba, Rosa sp., and R. centifolia (Table 5). In the cultivars, the percentage of polymorphism ranged from 8.09% to 11.91%, with the lowest level observed in ‘Svezhen’ and the highest in ‘Yanina’. The cultivar ‘Iskra’ exhibited the lowest total number of loci (TB = 151), whereas the highest number was recorded in ‘Eleina’ (TB = 161) (Table 5). Nei’s genetic distances among the cultivars ranged from 0.0843 to 0.2501, with the greatest distance observed between ‘Yanina’ and ‘Svezhen’ (Supplementary Table S2), reflecting their reduced genetic similarity and the overall limited diversity within the cultivar group. Among all accessions, Nei’s genetic distances varied from 0.0843 to 0.4490 (P7 and A1, Supplementary Table S1).
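The per-group band statistics discussed above (total bands, percentage of polymorphic loci, and group-specific loci) can be derived from the binary matrix as in the following sketch. The function name and the toy data are hypothetical, and expressing the polymorphism percentage relative to all scored loci is an assumption about the convention used.

```python
def group_summary(profiles, groups):
    """profiles: list of 0/1 band lists (one per individual); groups: parallel list of labels."""
    n_loci = len(profiles[0])
    labels = sorted(set(groups))
    counts = {lab: [0] * n_loci for lab in labels}   # individuals per group showing each band
    sizes = {lab: 0 for lab in labels}
    for profile, lab in zip(profiles, groups):
        sizes[lab] += 1
        for k, call in enumerate(profile):
            counts[lab][k] += call
    summary = {}
    for lab in labels:
        present = [c > 0 for c in counts[lab]]
        # Polymorphic within the group: present in some but not all of its individuals
        n_poly = sum(1 for c in counts[lab] if 0 < c < sizes[lab])
        # Group-specific: present in this group and absent from every other group
        n_private = sum(
            1 for k in range(n_loci)
            if present[k] and all(counts[other][k] == 0 for other in labels if other != lab)
        )
        summary[lab] = {
            "TB": sum(present),
            "pct_polymorphic": 100 * n_poly / n_loci,
            "private": n_private,
        }
    return summary

# Hypothetical data: five loci, three individuals in group A and two in group B.
toy_profiles = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0],
]
toy_groups = ["A", "A", "A", "B", "B"]
summary = group_summary(toy_profiles, toy_groups)
print(summary)
```

In this toy set each group shows three bands in total, one locus that is polymorphic within the group (20% of the five loci), and two group-specific loci.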
We also conducted an analysis of molecular variance (AMOVA) to determine how the total genetic variation in the Rosa accessions is partitioned within and among groups (Table 6). As expected, most of the genetic variation (61%) was found within groups (p < 0.001), with the remaining 39% attributable to differences among groups (p < 0.001). The calculated ΦPT (PhiPT) value was 0.388, indicating a high level of population differentiation.
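ΦPT is the among-group share of the total estimated variance, which is why the reported value of 0.388 agrees with the 39% among-group component. The sketch below shows a one-level AMOVA on binary profiles under stated assumptions: squared distances are counted as the number of mismatched bands, the permutation test for significance is omitted, and negative variance estimates are not truncated. The function name and the toy data are hypothetical.

```python
from itertools import combinations

def phi_pt(profiles, groups):
    """profiles: list of 0/1 band lists; groups: parallel list of group labels."""
    n = len(profiles)
    labels = sorted(set(groups))
    g = len(labels)

    def sq_dist(a, b):
        # For binary data the squared Euclidean distance is the number of mismatched bands
        return sum(x != y for x, y in zip(a, b))

    def sum_of_squares(indices):
        # Sum of pairwise squared distances divided by the number of individuals
        if len(indices) < 2:
            return 0.0
        pairs = combinations(indices, 2)
        return sum(sq_dist(profiles[i], profiles[j]) for i, j in pairs) / len(indices)

    members = {lab: [i for i, gl in enumerate(groups) if gl == lab] for lab in labels}
    ss_total = sum_of_squares(list(range(n)))
    ss_within = sum(sum_of_squares(idx) for idx in members.values())
    ss_among = ss_total - ss_within
    ms_among = ss_among / (g - 1)
    ms_within = ss_within / (n - g)
    # Average group size, adjusted for unequal sample sizes
    n0 = (n - sum(len(idx) ** 2 for idx in members.values()) / n) / (g - 1)
    var_among = (ms_among - ms_within) / n0
    var_within = ms_within
    return var_among / (var_among + var_within)

# Hypothetical data: two groups of three individuals scored at four loci.
toy_profiles = [
    [1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0],
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1],
]
toy_groups = ["A", "A", "A", "B", "B", "B"]
phi = phi_pt(toy_profiles, toy_groups)
print(phi)
```

For these strongly separated toy groups the sketch returns 0.8125, meaning about 81% of the estimated variance lies among groups.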
3.3. Principal Coordinate Analysis (PCoA) and Genetic Structure
To assess the genetic relationships and population structure among the 38 Rosa accessions, UPGMA clustering, PCoA, and STRUCTURE analysis were applied. This multivariate approach was chosen to complement the results of the cluster analysis. In general, clustering has higher resolution for discriminating closely related populations, whereas the PCoA can provide an informative representation of the overall genetic relationships and distances among major groups.
Overall, the first three axes of the PCoA, based on the genetic distances derived from the SCoT-PCR profiles of the fifteen markers, explained approximately 30% of the total variance (14.19%, 8.74%, and 7.14%, respectively). The PCoA biplot (Figure 3) separated the 38 Rosa accessions into four major genetic groups. Group 1 includes the Rosa sp. accessions (Kl1–Kl8), which form a single subpopulation that is genetically distant from the other populations. Group 2 includes the five accessions of R. gallica (G1–G3, G4, D), as well as ‘Raduga’ (R1–R2) and R. centifolia (C1–C2), indicating that these are the most closely related. As in Group 1, all individuals of R. alba (A1–A3) in Group 3 formed a single cluster, genetically distant from all other groups. Finally, Group 4 comprises all Bulgarian cultivars (E1–E2, I1–I2, SV1–SV2, and Y1–Y2) together with the Bulgarian population of R. damascena, ‘Population 5’ (P1–P9). The third PCoA axis reveals a clear separation of the R. centifolia accessions (C1 and C2) from the remaining accessions in Group 2, indicating that their genetic profile is distinct from those of the other rose species (Supplementary Figure S2).
The subsequent cluster analysis based on Nei’s genetic distances confirmed the grouping obtained from the PCoA analysis, clearly separating R. centifolia and R. alba accessions into distinct clusters, again resolving four major clusters. Cluster 1 comprises R. damascena ‘Population 5’ together with all Bulgarian cultivars. Cluster 2 includes all accessions of R. gallica and all individuals of Rosa sp., arranged into separate subclusters, with the complex hybrid ‘Raduga’ positioned in an intermediate subcluster between them. R. alba and R. centifolia each form a separate, well-defined cluster (Figure 4). Within the R. gallica varietal group, accession G1, originating from IREMC, clusters together with the Romanian representative D, whereas G2 and G3, also germplasm maintained at IREMC, form a subcluster with the second Romanian accession, G4.
Genetic structure was then analyzed, and ∆K was used to determine the optimal number of genetic clusters (K). The highest ∆K value was observed at K = 5 (Figure 5A). These results indicated that the 38 Rosa accessions can be assigned to five major groups based on their Q values, which reflect the proportional membership of each individual in a given ancestral cluster and are represented by different colours (Supplementary Table S3). Cluster Q1 contains the two accessions of R. centifolia (C1–C2). Cluster Q2 includes all four accessions of R. alba (A1–A4). Cluster Q3 comprises all individuals belonging to R. damascena ‘Population 5’ (P1–P9), as well as the Bulgarian cultivars ‘Eleina’ (E1–E2), ‘Yanina’ (Y1–Y2), ‘Iskra’ (I1–I2), and ‘Svezhen’ (SV1–SV2). Cluster Q4 includes all accessions of R. gallica: three accessions from IREMC (G1, G2–G3) and two accessions from Romania (G4 and D). Cluster Q5 encompasses all eight accessions of Rosa sp.
The two samples of ‘Raduga’ were classified as admixed genotypes due to their mixed ancestry components (Q value < 0.6). They showed predominant membership in clusters Q3 and Q4, with a minor proportion assigned to Q5 (Figure 5B, Supplementary Table S3).
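The assignment rule implied above can be sketched as follows: an individual is placed in the cluster with its largest Q value unless that value falls below 0.6, in which case it is labelled admixed. The function name and the Q vectors are hypothetical and are not the values in Supplementary Table S3.

```python
def assign_cluster(q_values, threshold=0.6):
    """q_values: membership coefficients for clusters Q1..Qk (should sum to about 1)."""
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    if q_values[best] < threshold:
        return "admixed"
    return f"Q{best + 1}"

# Hypothetical membership coefficients for K = 5.
q_table = {
    "P1": [0.01, 0.01, 0.96, 0.01, 0.01],
    "R1": [0.02, 0.03, 0.45, 0.40, 0.10],
}
assignments = {name: assign_cluster(q) for name, q in q_table.items()}
print(assignments)
```

Here the first individual is assigned to Q3, while the second, whose ancestry is split mainly between Q3 and Q4, is labelled admixed.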
4. Discussion
Molecular markers have been extensively employed for the characterization of germplasm, analysis of genetic diversity, determination of origin, estimation of genetic distances, gene mapping, and marker-assisted selection [52,53,54,55,56,57]. Numerous marker systems, such as simple sequence repeats (SSR), randomly amplified polymorphic DNA (RAPD), inter-simple sequence repeats (ISSR), universal rice primers (URP), start codon-targeted primers (SCoT), and cis-element amplified polymorphism (CEAP), have been applied in studies of genetic diversity [15,19,33,58,59] and population structure [2,18,60] within the genus Rosa.
The present study investigates the application of gene-targeted SCoT markers for assessing genetic diversity and analysing genetic relationships among 38 Rosa accessions. The SCoT marker technique is simple, cost-effective, rapid, efficient, and highly reproducible, requiring only a small amount of DNA and no prior knowledge of DNA sequence information [32]. These markers were designed based on the ATG context, a conserved region flanking the translation initiation codon; consequently, SCoT markers are associated with functional genes and their correlated traits [32]. In comparison with RAPD, AFLP, and ISSR marker systems, SCoT represents a gene-targeted, multilocus approach capable of generating more biologically meaningful information and is particularly effective for detecting high levels of genetic polymorphism [55,61,62].
4.1. Efficiency of SCoT Markers and Overall Genetic Diversity
The fifteen SCoT markers employed in this study showed high informativeness and discriminatory power, as indicated by the mean values of polymorphic information content (PIC = 0.52), effective multiplex ratio (EMR = 18.5), marker index (MI = 9.4), and resolving power (RP = 19.2). These values revealed substantial genetic diversity among the 38 Rosa accessions (I = 0.40, He = 0.26) and a high level of polymorphism (89%). These findings demonstrate that, despite the traditionally assumed narrow genetic base of oil-bearing roses [14,15,63], gene-targeted markers can uncover considerable hidden variation. Previous studies have similarly demonstrated the effectiveness of SCoT markers, revealing high levels of genetic polymorphism in rose cultivars, and have highlighted their value for germplasm management, propagation strategies, and the conservation of genetic resources [40,60,64]. They have been used to detect interspecific variation [33], distinguish different cultivars [40,64], and identify populations [2,18,19,60]. The assessment of genetic diversity with SCoT markers in roses enables effective germplasm classification, guiding the selection of diverse parental lines for breeding programs [33].
4.2. Genetic Resources of Rosa damascena and Their Significance
The opportunities to improve industrial R. damascena cultivars are largely restricted to clonal selection, as maintaining the traditional aroma, key phenotypic traits, and the characteristic composition of rose oil is essential. Consequently, global rose oil production depends on one or only a few closely related genotypes [14]. However, molecular technologies can substantially accelerate the selection process compared to conventional approaches. They can also provide a means to meet the ongoing demand for novel cultivars by facilitating the identification of suitable parents or populations with desirable traits within the existing genetic resources of the Damask rose [14].
In the present study, we analyzed the genetic potential of 17 R. damascena accessions (nine from ‘Population 5’ and eight representing four cultivars), all maintained in the scientific experimental collection of IREMC. This germplasm represents the foundation of modern rose cultivation in Bulgaria, yet it remains insufficiently characterized at the molecular level. Earlier studies based on SSR markers reported genetic uniformity among 24 accessions of clonal lines and cultivars of R. damascena from the IREMC collection [15]. A more recent study [19] using ISSR markers confirmed the limited genetic potential for selection within the Bulgarian R. damascena cultivars (I = 0.16, He = 0.10). However, the same study also revealed substantial genetic variability in the widely cultivated population of the Kazanlik rose (‘Population 5’), where high polymorphism was detected among 12 accessions (I = 0.36, He = 0.24) [19]. The present study corroborates this trend, revealing higher genetic diversity within the nine accessions from the IREMC ‘Population 5’ than within the eight accessions representing the Bulgarian cultivars (I = 0.22, He = 0.15, Nei’s genetic distance = 0.0843–0.2501). The markedly higher genetic diversity (Ne = 1.35, I = 0.29, He = 0.20) detected within the nine accessions of the locally improved R. damascena ‘Population 5’ reflects the trait-based selection approach used for its development. Four distinct clones (‘Svezhen 188’, ‘Svezhen 189’, ‘Svezhen 191’ and ‘Svezhen 190’) were combined in the propagation of this population [38]. These results further highlight the importance of ‘Population 5’ as a key reservoir of genetic variation that can support future breeding and selection programs.
The genetic polymorphism among Bulgarian R. damascena accessions has also been reported in a study of 16 populations from Greece, Turkey, France, and Bulgaria [18]. In that study, three Bulgarian accessions originating from production fields were analyzed; one accession clustered with genotypes from Turkey and Greece, whereas the other two formed distinct clusters and exhibited greater genetic similarity to French genotypes [18]. The genetic diversity in R. damascena has also been documented in several regions worldwide, including Morocco (36 accessions evaluated using 13 ISSR markers [4]), India (29 accessions analyzed using 36 SCoT markers [40]), and Iran (40 accessions from five regions assessed with 12 SCoT primers and 14 URP primers [2]). Additionally, Chtourou [3] highlights the species’ strong adaptive capacity, noting that epigenetic mechanisms contribute to its adjustment to local factors such as climate and cultivation practices, resulting in both phenotypic and genetic modifications [3].
Among the four Bulgarian cultivars examined, the percentage of polymorphic loci ranged from 8.09% to 11.91%, with the lowest value observed in the clonally selected cultivar ‘Svezhen’ and the highest in the mutant-derived ‘Yanina’. Nei’s genetic distances among cultivars varied from 0.0843 to 0.2501, with the greatest divergence detected between ‘Yanina’ and ‘Svezhen’, reflecting their distinct breeding origins. The SCoT markers additionally identified 20 cultivar-specific loci that require further investigation. These loci may relate to specific agronomic traits or serve for cultivar identification and authentication, as shown in studies of other plant species [65,66,67,68]. Overall, our findings provide an evidence-based framework for developing marker-based selection strategies to improve essential oil productivity in Bulgarian rose cultivars.
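The three quantities discussed in this paragraph (percentage of polymorphic loci, Nei's genetic distance, and group-specific loci) can be illustrated with a short sketch. This is a simplified illustration, not the software or settings used in the study: it assumes band frequencies are used directly as allele frequencies and applies Nei's (1972) standard distance without any small-sample correction. All function names and the toy matrices are hypothetical.

```python
import math

def percent_polymorphic(matrix):
    """Percentage of loci (columns) that are neither all-present nor all-absent."""
    cols = list(zip(*matrix))
    poly = sum(1 for c in cols if 0 < sum(c) < len(c))
    return 100.0 * poly / len(cols)

def band_freqs(matrix):
    """Band frequency per locus; rows are accessions, columns are loci."""
    return [sum(c) / len(c) for c in zip(*matrix)]

def nei_distance(group_x, group_y):
    """Nei's (1972) standard distance, D = -ln(Jxy / sqrt(Jx * Jy))."""
    px, py = band_freqs(group_x), band_freqs(group_y)
    jx = sum(p * p + (1 - p) ** 2 for p in px)
    jy = sum(p * p + (1 - p) ** 2 for p in py)
    jxy = sum(a * b + (1 - a) * (1 - b) for a, b in zip(px, py))
    return -math.log(jxy / math.sqrt(jx * jy))

def private_loci(groups):
    """Map each group name to the locus indices whose band occurs only in that group."""
    present = {name: [sum(c) > 0 for c in zip(*m)] for name, m in groups.items()}
    n_loci = len(next(iter(present.values())))
    out = {name: [] for name in groups}
    for j in range(n_loci):
        holders = [name for name, flags in present.items() if flags[j]]
        if len(holders) == 1:
            out[holders[0]].append(j)
    return out

# Hypothetical toy groups: 2 accessions x 4 loci each (not data from this study)
group_a = [[1, 1, 0, 0], [1, 0, 0, 0]]
group_b = [[1, 0, 1, 0], [1, 0, 1, 0]]
ppl_a = percent_polymorphic(group_a)
dist_ab = nei_distance(group_a, group_b)
specific = private_loci({"A": group_a, "B": group_b})
```

In this toy case only one of four loci varies within group A, and each group carries one band absent from the other, which is the kind of pattern that marks a locus as cultivar-specific.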
4.3. Genetic Position of the Unidentified Bulgarian Rose Genotype
Another major contribution of the present study is the molecular characterization of the unknown Bulgarian rose genotype (Rosa sp.) found in an old rose plantation in the nearby town of Klisura. All analytical approaches applied (PCoA, UPGMA, and STRUCTURE) consistently placed this genotype in a distinct genetic group, clearly separated from all studied R. damascena accessions. In the STRUCTURE analysis, the Rosa sp. genotype showed no evidence of gene flow from the genotypes included in the study. In the cluster analysis, however, Rosa sp. formed a separate sub-cluster most closely related to Rosa ‘Raduga’ and all R. gallica accessions, suggesting a closer relationship with R. gallica and a possible complex hybrid origin involving this species. All eight analyzed Rosa sp. samples exhibited low genetic diversity (Ne = 1.17, I = 0.15, He = 0.10), supporting the information provided by the owner that the plantation in Klisura originated from vegetative propagation of a single mother plant (personal communication). The low intra-group variability, along with the presence of only one genotype-specific locus, indicates a high degree of genetic uniformity, likely resulting from this vegetative propagation. Further molecular, morphological and phytochemical analyses are needed to clarify its taxonomic status, essential oil quality characteristics, and potential economic value. Including additional rose species not covered in the present study would help resolve the genetic relationships of Rosa sp. with other oil-bearing roses.
4.4. Genetic Relationships Among Oil-Bearing Rosa Species
Our interspecific analyses clearly discriminated the major taxa included in the study. All R. damascena accessions (including ‘Population 5’ and the cultivars) formed a stable and well-defined group distinct from R. gallica, R. alba, and R. centifolia. The R. gallica accessions (three from Bulgaria and two from Romania) clustered into a separate, internally structured group. This pattern indicates a high degree of genetic relatedness regardless of geographic origin and supports the hypothesis of long-term cultivation and exchange of this species within the region. It is widely considered that in roses, a primary factor contributing to the lack of strong correlation between genetic and geographic divergence is human intervention [69]. Despite this, the groups analyzed in the present study exhibited greater divergence at the intragroup level (61%) than at the intergroup level (39%), a pattern also reported in other studies on roses [2,5]. Rosa alba and R. centifolia were resolved as distinct units, while the hybrid cultivar ‘Raduga’ exhibited an intermediate, admixed profile, confirming the sensitivity of the SCoT markers in detecting its complex hybrid origins. The high genetic differentiation observed in our study (ΦPT = 0.388) is consistent with the hybrid origin and complex polyploid nature of these species. As with R. damascena, a previous study using SSR markers and genetic materials from the IREMC experimental collection reported that R. alba production largely relies on a single genotype [63].
A recent SSR-based study of R. gallica accessions identified key cultivars from the R. centifolia, R. alba, and Damask groups as interspecific hybrids of R. gallica, but could not identify the other species involved in these hybridizations. The authors recommended using a specific set of markers to resolve intra-generic relationships in such cultivars to uncover the origins of R. centifolia, R. alba, and Damask roses [70]. Our SCoT-based study further separated these three groups in population structure and cluster analyses. However, the first two axes of the PCoA analysis (Figure 3) indicated similarity of R. centifolia accessions to R. gallica and the cultivar ‘Raduga’, which is known to be a complex hybrid of R. gallica. This was clearly reflected in the STRUCTURE analysis, which showed a mixed origin for ‘Raduga’ (Q value < 0.6) and demonstrated the resolution of the primers used. All analytical methods applied (Figures 3–5 and Figure S2) successfully distinguished both R. centifolia accessions as separate genotypes. Overall, the AMOVA results revealed a higher proportion of the genetic variation within groups (61%) than among groups (39%), thus underlining the importance of individual-level diversity and intra-population variation for effective genetic resource management. This is particularly relevant for oil-bearing roses, where the genetic base is limited and selection pressure is high.
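The AMOVA partition reported in this section (61% within groups, 39% among groups) and the differentiation statistic ΦPT = 0.388 are two views of the same result, since ΦPT is the among-group share of the total estimated variance. The sketch below illustrates that relationship; the function name is hypothetical, and the inputs are the rounded percentages from the text rather than the underlying variance components.

```python
def phi_pt(var_among, var_within):
    """PhiPT as the among-group fraction of total estimated variance."""
    return var_among / (var_among + var_within)

# Rounded shares from the text: 39% among groups, 61% within groups
phi = phi_pt(39.0, 61.0)
```

With these rounded inputs the result is 0.39, which agrees with the reported ΦPT of 0.388 to within rounding.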
4.5. Genetic Distinctiveness, Possible Historical Origin, and Conservation Significance of the Studied Romanian Roses
The Romanian roses included in this study, two R. gallica and one R. alba accession from southern Romania, have likely been traditionally cultivated for household uses such as jam and herbal tea preparation. Clarifying their genetic relationships with modern cultivars, while conserving these genotypes, is an important aspect of rose research. Both species show high resistance to adverse environmental conditions and low cultivation requirements [71], characteristics confirmed in current studies by the strong vegetative propagation capacity of R. alba from Cărbunești (A4) and R. gallica from Poeni (G4).
Field observations further indicate that several rose plants from Cărbunești occur in multiple household gardens within the same locality, suggesting long-term vegetative propagation from closely related source plants. This pattern, together with their genetic distinctiveness and morphological characteristics, indicates that these accessions are unlikely to represent modern rose cultivars. The R. gallica accession from Poeni (G4) exhibits phenotypic features comparable to those described for historical cultivars with a gallica background, such as Marbled Rose (Rosa marmorea, sensu historical literature), which is currently reported mainly from botanical collections. Although direct genetic comparison with authenticated R. marmorea material is not yet available, the observed similarities point to a possible relationship with ancient rose lineages described in pre-modern horticultural literature.
It is known that R. alba thrives in cooler climates and on poor soils, suggesting favourable prospects for oil production. Therefore, it is suitable for cultivation in marginal or northern areas where it could support organic production and provide economic opportunities for disadvantaged communities [71]. Other benefits from rose cultivation could include landscape stabilization in erosion- and landslide-prone regions [71,72]. Given their cold tolerance, general hardiness, disease resistance, low maintenance needs, and self-supporting growth habit—especially in R. gallica—both species are well suited to low-input, sustainable agricultural systems, supporting their inclusion in genetic diversity assessments and future studies on essential oil production under diverse environmental conditions [73].
5. Conclusions
The present study demonstrates that start codon-targeted (SCoT) markers are powerful and reliable tools for discriminating among oil-bearing rose species, cultivars, and individual genotypes. The high level of polymorphism detected confirms their suitability for genetic characterization and population structure analysis within the genus Rosa.
The locally improved R. damascena ‘Population 5’ exhibited the highest genetic diversity among the studied materials, demonstrating its importance as a key genetic source for future breeding and selection programs. In contrast, the Bulgarian R. damascena cultivars showed comparatively lower diversity, reflecting their clonal origin. The cultivar-specific SCoT loci identified in this study should be further examined for potential associations with agronomic traits and, in combination with the genetic distance estimates, may assist in informed parental selection.
The unidentified Bulgarian genotype (Rosa sp.) was clearly differentiated from R. damascena accessions, showing greater genetic similarity to R. gallica and the cultivar ‘Raduga’, suggesting a closer evolutionary relationship with R. gallica.
The unknown Rosa sp. from Bulgaria, along with the Romanian R. gallica and R. alba accessions, represents genetically distinct traditional germplasm, likely maintained through long-term vegetative propagation, and constitutes valuable resources for diversity conservation and sustainable rose breeding. Overall, this study extends our understanding of the genetic diversity and structure of oil-bearing roses and adds valuable information on the relationships among commonly used Rosa genotypes. The data obtained can support future breeding and conservation efforts aimed at enhancing productivity, adaptability, and sustainable management of this economically and culturally significant crop.
References
1. Smulders M.J., Arens P., Bourke P.M., Debener T., Linde M., De Riek J., Leus L., Ruttink T., Baudino S., Hibrant Saint-Oyant L. In the name of the rose: A roadmap for rose research in the genome era. Hortic. Res. 2019, 6, 65. doi:10.1038/s41438-019-0156-0. PMID: 31069087. PMCID: PMC6499834.
2. Mostafavi A.S., Omidi M., Azizinezhad R., Etminan A., Badi H.N. Genetic diversity analysis in a mini core collection of Damask rose (Rosa damascena Mill.) germplasm from Iran using URP and SCoT markers. J. Genet. Eng. Biotechnol. 2021, 19, 144. doi:10.1186/s43141-021-00247-7. PMID: 34591207. PMCID: PMC8484433.
3. Chtourou K., Salazar J.A., Ortuño-Hernández G., Mezghani N., Trifi-Farah N., Martínez-Gómez P., Krichen L. Genetic diversity and relationships among Tunisian wild and cultivated Rosa L. species. Plants 2024, 13, 3563. doi:10.3390/plants13243563. PMID: 39771261. PMCID: PMC11678506.
4. Lebkiri N., Abbas Y., Iraqi D., Gaboun F., Saghir K., Fokar M., Diria G. Morphological characterization and genetic diversity of a mini core collection of Rosa damascena from Morocco. J. Genet. Eng. Biotechnol. 2024, 22, 100423. doi:10.1016/j.jgeb.2024.100423. PMID: 39674641. PMCID: PMC11462007.
5. Saghir K., Abdelwahd R., Iraqi D., Lebkiri N., Gaboun F., El Goumi Y., Diria G. Assessment of genetic diversity among wild roses in Morocco using ISSR and DAMD markers. J. Genet. Eng. Biotechnol. 2022, 20, 150. doi:10.1186/s43141-022-00425-1. PMID: 36318391. PMCID: PMC9626730.
6. Koopman W.J., Wissemann V., De Cock K., Van Huylenbroeck J., De Riek J., Sabatino G.J., Smulders M.J. AFLP markers as a tool to reconstruct complex relationships: A case study in Rosa (Rosaceae). Am. J. Bot. 2008, 95, 353–366. doi:10.3732/ajb.95.3.353. PMID: 21632360.
7. Harmon D.D., Chen H., Byrne D., Liu W., Ranney T.G. Cytogenetics, ploidy, and genome sizes of rose (Rosa spp.) cultivars and breeding lines. Ornam. Plant Res. 2023, 3, 10. doi:10.48130/OPR-2023-0010.
8. Babaei A., Tabaei-Aghdaei S.R., Khosh-Khui M., Moradi H., Naghavi M.R., Kalantar E. Microsatellite analysis of Damask rose (Rosa damascena Mill.) accessions from various regions in Iran reveals multiple genotypes. BMC Plant Biol. 2007, 7, 12. doi:10.1186/1471-2229-7-12. PMID: 17346330. PMCID: PMC1832195.
