Development of Simple Sequence Repeat (SSR) Markers from a Genome Survey of a Cymbidium kanran Makino Population in Jeju Island, Republic of Korea
Kyeoung Cheol Kim, Seungtae Kang, Su-Lim Kim, Rambukkana Maggonage Thiruni Dananjana Perera, Jin Kyu Woo, Kumarasinghe Hiruni Sandunika, Ji-Hyang Kim, Dong-Sun Lee

TL;DR
Researchers developed genetic markers to study and protect a rare orchid species on Jeju Island.
Contribution
New SSR markers were developed to distinguish Jeju's Cymbidium kanran from other populations.
Findings
86 SSR marker candidates were identified from genome sequencing.
25 polymorphic SSR markers were selected for their usefulness in conservation and identification.
The markers will help in conservation and cultivar identification of C. kanran.
Abstract
Cymbidium kanran Makino, an economically significant ornamental plant, is found in small numbers in its natural habitat on Jeju Island in South Korea. C. kanran on Jeju is protected because its population has declined owing to environmental changes and illegal poaching. We developed simple sequence repeat (SSR) markers to analyze how it differs from other C. kanran through molecular genetic studies. Based on the results of random amplified polymorphic DNA (RAPD) analysis and whole-genome sequencing, 86 initial SSR marker candidates were selected in silico. After excluding those that were structurally unsuitable, 40 were reselected through polymorphism testing. Finally, 25 markers were selected based on the diversity test and their applicability to real samples. The newly developed markers will prove invaluable in substantiating the distinctiveness of C. kanran…
Taxonomy
Topics: Agriculture, Soil, Plant Science · Ecology and Conservation Studies · Food Quality and Safety Studies
Introduction
Cymbidium is an important horticultural plant genus with high economic and ornamental value [1]. Characterizing its species is therefore vital for their management and conservation and for understanding their genetic relationships [2]. In the Republic of Korea, 92 species, 11 varieties, and 7 cultivars of Orchidaceae plants grow naturally. Among these, Cymbidium species such as Cymbidium kanran, C. koran, C. lancifolium, C. javanicum var. aspidistrifolium, C. nipponicum, and C. goeringii are distributed on the southern slopes of Hallasan Mountain on Jeju Island. C. kanran was first described in 1900 by the Japanese botanist Makino. It is found in temperate southern climates across several regions, including Jeju Island in the Republic of Korea, southern Japan, Taiwan, and southern China, including Yunnan Province. The natural habitat of C. kanran on Jeju Island has been destroyed by the development of pastures and citrus orchards, and the species faces extinction owing to illegal poaching [3].
The Jeju Sanghyo-dong C. kanran habitat (natural monument no. 432) is the only location in the Republic of Korea where the flowering of C. kanran has been observed [4]. Therefore, it is necessary to adopt immediate measures to conserve these species. Before developing conservation measures for any species, it is crucial to understand its genetic diversity and organization, which requires effective marker resources. In the past, Cymbidium biodiversity has been evaluated based on morphological and physiological traits; however, this approach has constraints because environmental factors influence these characteristics [2].
The ploidy, chromosome number, and reproductive system of C. kanran are also relevant to understanding its biology. C. kanran is generally diploid, with a chromosome number of 2n = 40. Natural self-pollination (autogamy) is absent in this species, so it requires external pollination to produce seeds. However, the natural fruit set rate is significantly lower than that achieved through artificial pollination, indicating that natural pollination is substantially limited. This species relies primarily on insect pollination for reproduction. While many orchids use nectar to attract pollinators, about one-third of species, including C. kanran, employ deceptive strategies such as sexual and food deception. A study by Japanese researchers demonstrated that the unique floral features of C. kanran, including its specialized labellum, pollinia, and gynandrium, evolved through complex interactions with pollinators. The species produces specific volatile compounds to attract Apis cerana japonica for pollination, despite not offering a food reward [5, 6].
Molecular markers are increasingly used to understand population genetic structure, parentage, population viability, gene flow, and genetic diversity, to study synthetic pathways, evolution, and the effects of habitat fragmentation, and to guide conservation strategies [7, 8]. Molecular markers used to analyze genetic polymorphisms are categorized into protein- and DNA-based markers. While early markers such as allozymes were simple and neutral, they were replaced by DNA-based markers owing to their low polymorphism and sensitivity to environmental variation. Among DNA markers, restriction fragment length polymorphism (RFLP) and variable number tandem repeat (VNTR) markers have advantages such as high transferability but are costly and labor intensive. PCR-based markers, such as random amplified polymorphic DNA (RAPD), simple sequence repeat (SSR), and single nucleotide polymorphism (SNP) markers, are more commonly used. RAPD does not require sequence data but is a dominant marker, whereas SSRs and SNPs require sequence data, are codominant, and show high polymorphism [9, 10]. Among PCR-based markers, SSR markers have become widely adopted because of their high polymorphism, codominant inheritance, and extensive applicability in genetic studies. SSRs are short, tandemly repeated DNA sequences distributed throughout the genome. Their high variability, which stems from differences in repeat number, makes them invaluable tools for assessing genetic diversity, population structure, and gene flow. Because SSRs are codominant, both alleles can be detected in heterozygous individuals, which offers more comprehensive genetic insight than dominant markers such as RAPD. Additionally, SSR markers are highly reproducible, can detect even subtle genetic differences, and are found across a wide range of species.
This study focused on developing SSR markers specific to C. kanran as a crucial step toward understanding its genetic diversity and population dynamics, providing powerful tools to support conservation efforts and secure the long-term survival of this endangered orchid species.
Materials and Methods
C. kanran Samples
A total of 20 C. kanran leaf samples were obtained from the Jeju Special Self-Governing Province World Heritage Headquarters. Details are provided in Table 6 and Fig. S1.
Genomic DNA Extraction
Genomic DNA (gDNA) was extracted from the provided samples using the Bio-medic Plant gDNA Extraction Kit (www.ibiomedic.co.kr). The extracted gDNA was quantified using a DS-11 Spectrophotometer (DeNovix, USA; Bio-Health Materials Core-Facility, Jeju National University) and verified by 1% (w/v) agarose gel electrophoresis.
RAPD Analysis
As part of the preparatory process for whole-genome sequencing (WGS) of C. kanran reference samples, RAPD PCR was performed using gDNA extracted from 16 of the provided samples as templates. Allelic data obtained from RAPD PCR were used to construct UPGMA and Neighbor-Joining (NJ) dendrograms for genetic relationship analysis. The UPGMA phylogenetic tree was constructed using MEGA software (version 7.0; [11]).
Whole Genome Sequencing of C. kanran Samples
Whole-genome sequencing was performed on two C. kanran samples (JCK-01, JCK-07). gDNA was extracted from both samples, and paired-end whole-genome sequencing was conducted using the Illumina NovaSeq6000 platform. Given the estimated genome size of Cymbidium species, approximately 60 Gb of raw sequencing data were generated for each sample. Libraries for whole-genome sequencing were prepared using the Illumina TruSeq Nano DNA Library Kit.
DNA Fragment Analysis
PCR was performed using HS™ Taq PCR polymerase (Dongsheng Biotech, China) and an ABI 2720 thermal cycler (Applied Biosystems, USA). The PCR reaction and amplification conditions are shown in Table 7.
PCR amplification products were diluted 1:50 with distilled water. One microliter of the diluted sample was added to 9 µL of a 99:1 mixture of Hi-Di Formamide (Applied Biosystems) and GeneScan 500 LIZ size standard (Applied Biosystems) and denatured at 95°C for 2 min. After cooling on ice, fragment analysis was performed on a 3730 DNA Analyzer equipped with a 50 cm capillary. DNA fragment size was analyzed using GeneMapper software (ver. 4.0; Applied Biosystems). Genetic parameters, including major allele frequency (MAF), number of alleles (NA), genetic diversity (GD), and polymorphism information content (PIC), were measured by calculating shared allele frequencies using PowerMarker software (version 3.25; [12]).
Results
Identification of SSR Candidate Markers through Whole-Genome Sequencing and Comparative Genomic Analysis
To select samples for whole-genome sequencing (WGS), random amplified polymorphic DNA (RAPD) PCR was conducted on 16 C. kanran samples. Genetic relationships among the samples were assessed by constructing dendrograms using the Neighbor-Joining (NJ) method (Fig. 1). The dendrogram revealed that four native habitat samples (JCK-02, JCK-03, JCK-04, and JCK-05) were genetically identical or highly similar, indicating low genetic variation among them. In contrast, JCK-01 and JCK-07 exhibited greater genetic divergence and were therefore selected for whole-genome sequencing.
A review of available genome data for the genus Cymbidium showed that no genomic information was available for C. kanran, while the average genome size of four closely related species was approximately 4 Gb (Table 1). WGS was designed to achieve 15x genome coverage for C. kanran. Paired-end sequencing generated 73.8 Gb and 75.4 Gb of raw data for JCK-01 and JCK-07, respectively, yielding approximately 500 million reads for JCK-01 and 490 million reads for JCK-07. The GC content was 33.17% and 32.99% for JCK-01 and JCK-07, respectively (Table S1).
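The relationship between genome size, target coverage, and raw data volume can be checked with simple arithmetic. The sketch below uses only figures quoted in the text; the 4 Gb genome size is the approximate average of related species, not a measured value for C. kanran, and the function names are illustrative.

```python
# Figures taken from the text. The 4 Gb genome size is an assumption based on
# the average of four related Cymbidium species, not a C. kanran measurement.
GENOME_SIZE_GB = 4.0
TARGET_COVERAGE = 15


def required_data_gb(genome_size_gb, coverage):
    """Raw data volume (Gb) needed to reach a given average depth."""
    return genome_size_gb * coverage


def achieved_coverage(raw_data_gb, genome_size_gb):
    """Average depth implied by a raw data volume, before any trimming."""
    return raw_data_gb / genome_size_gb


raw_data_gb = {"JCK-01": 73.8, "JCK-07": 75.4}

print(required_data_gb(GENOME_SIZE_GB, TARGET_COVERAGE))  # 60.0
for sample, gb in raw_data_gb.items():
    print(sample, round(achieved_coverage(gb, GENOME_SIZE_GB), 2))
```

Under this assumption, the 15x design corresponds to about 60 Gb per sample, and the reported raw data volumes imply a somewhat higher depth (roughly 18x) before preprocessing and trimming.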
The raw sequencing data were preprocessed and trimmed, followed by de novo genome assembly to construct contigs. Microsatellite-containing contigs were identified, and multiple sequence alignment was performed using the CLC Genomics Workbench. As a result, 86 candidates for polymorphic SSR markers were identified through in silico analysis.
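To illustrate what identifying microsatellite-containing contigs involves, the following is a minimal sketch that scans a sequence for perfect tandem repeats with a regular expression. It is not the procedure used in this study, which relied on the CLC Genomics Workbench; the function name `find_ssrs`, the minimum repeat thresholds, and the example contig are all hypothetical.

```python
import re


def find_ssrs(sequence, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect microsatellites."""
    # Hypothetical thresholds: minimum repeat count for each motif length.
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit ("AAAA", "AGAG").
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            repeat_count = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeat_count))
    return sorted(hits)


# Hypothetical contig containing an (AG)8 and a (GAT)5 repeat.
contig = "GGCT" + "AG" * 8 + "TTCA" + "GAT" * 5 + "CC"
print(find_ssrs(contig))  # [(4, 'AG', 8), (24, 'GAT', 5)]
```

Candidate loci found this way would still need to be compared between the two sequenced samples to judge whether the repeat number differs, which is the polymorphism the in silico step is looking for.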
Polymorphism of Identified SSR Candidate Markers
Of the 86 polymorphic SSR candidate markers identified, primers were designed for 59 after excluding those prone to hairpin, self-dimer, or hetero-dimer formation. A complete list of the synthesized primers is provided in Table 2. Six C. kanran samples (JCK-01, JCK-06, JCK-07, JCK-09, JCK-10, and JCK-15) were analyzed using the 59 SSR markers. After applying selection criteria based on polymorphism, PCR amplification clarity, and efficiency, 40 markers were retained (Fig. 2 and Table 3).
Genetic Diversity Analysis Using Discovered SSR Markers
Genetic diversity analysis of the initial 16 C. kanran samples, using 40 SSR markers, revealed substantial genetic variation among most samples. The UPGMA dendrogram (Fig. 3) indicated clear genetic separation; however, some samples displayed identical allele patterns. Specifically, JCK-02, JCK-03, JCK-04, and JCK-05 shared identical alleles across all 40 loci, suggesting high genetic similarity or possible redundancy among these individuals.
A detailed summary of the genetic characteristics of the 40 SSR loci is provided in Table 4. This table includes key metrics such as Major Allele Frequency (MAF), representing the frequency of the most common allele at each locus; Number of Alleles (NA), indicating observed allelic diversity; and Polymorphism Information Content (PIC), reflecting the informativeness of each marker for genetic diversity assessment.
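The metrics named above can be computed from allele calls at a single locus. The sketch below uses the standard textbook definitions (gene diversity as 1 minus the sum of squared allele frequencies, and PIC following Botstein et al., 1980). It is an illustration only: PowerMarker may apply sample-size corrections, so values need not match Table 4 exactly, and the genotypes shown are invented fragment sizes.

```python
from collections import Counter


def locus_statistics(genotypes):
    """Summarize one codominant locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid sample.
    Use None for a missing allele call.
    """
    alleles = [a for pair in genotypes for a in pair if a is not None]
    counts = Counter(alleles)
    total = sum(counts.values())
    freqs = [n / total for n in counts.values()]

    maf = max(freqs)                        # major allele frequency
    na = len(freqs)                         # number of alleles
    gd = 1.0 - sum(p * p for p in freqs)    # gene diversity
    # PIC subtracts the contribution of uninformative matings from GD.
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(na)
        for j in range(i + 1, na)
    )
    return {"MAF": maf, "NA": na, "GD": gd, "PIC": gd - cross}


# Hypothetical fragment sizes (bp) for four samples at one SSR locus.
example = [(150, 150), (150, 154), (154, 158), (150, 150)]
print(locus_statistics(example))
```

A locus dominated by one allele gives a MAF near 1 and a PIC near 0, which matches the pattern at the low end of the reported ranges.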
The genetic variability at each SSR locus was quantified based on allele count, heterozygosity, gene diversity, and PIC. A total of 246 alleles were identified across the 40 loci, an average of 6.2 alleles per locus. The MAF ranged from 0.16 to 0.97, while the number of detectable alleles (NA) per SSR marker varied between 2 and 15. The PIC values ranged from 0.06 to 0.90, indicating a wide range of polymorphism levels across the loci. These results demonstrate the potential of the selected SSR markers for assessing genetic diversity in C. kanran.
Final Selection of High-Quality SSR Markers and Genetic Diversity Analysis
Following a comprehensive evaluation of the 40 initially selected SSR markers, a final subset of 25 high-quality markers was chosen based on key selection criteria. These criteria included functionality across multiple samples, detection of two or fewer alleles per plant, and the absence of amplification artifacts (Table 3; markers indicated in red).
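The "two or fewer alleles per plant" criterion follows from diploidy: a single-locus marker should show at most two distinct fragment sizes in one individual. A minimal sketch of such a check is shown below, assuming allele calls are available as lists of fragment sizes per sample; the marker names, sample calls, and function names are hypothetical, and the other criteria (consistent amplification, no artifacts) are not modeled.

```python
def exceeds_diploid_limit(calls_by_sample, max_alleles=2):
    """Return sample IDs showing more distinct alleles than a diploid allows."""
    return sorted(
        sample
        for sample, alleles in calls_by_sample.items()
        if len(set(alleles)) > max_alleles
    )


def select_markers(calls_by_marker, max_alleles=2):
    """Keep markers for which no sample exceeds the allele limit."""
    return [
        marker
        for marker, calls in calls_by_marker.items()
        if not exceeds_diploid_limit(calls, max_alleles)
    ]


# Hypothetical fragment sizes (bp) called per sample for two markers.
calls_by_marker = {
    "SSR_A": {"JCK-01": [150, 154], "JCK-07": [150]},
    "SSR_B": {"JCK-01": [200, 204, 208], "JCK-07": [200, 204]},
}
print(select_markers(calls_by_marker))  # ['SSR_A']
```

A marker that yields three or more fragments in one plant may be amplifying more than one locus, which would make codominant scoring unreliable.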
To enhance the reliability of the genetic diversity analysis, four additional samples were included, resulting in a total of 20 C. kanran samples being analyzed using the 25 selected SSR markers. A new UPGMA dendrogram was constructed based on allele-sharing patterns among the samples (Fig. 4), revealing broader genetic differentiation. This updated dendrogram provided more comprehensive insights into the genetic diversity and population structure of C. kanran.
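A dendrogram of this kind can be sketched in two steps: compute a pairwise distance from shared alleles, then cluster with UPGMA. The example below is a plain-Python illustration rather than the MEGA workflow used in the study. The distance measure (1 minus the mean proportion of shared alleles) is an assumption, since the exact formula is not stated, and the sample names and genotypes are invented.

```python
def shared_allele_distance(geno_a, geno_b):
    """1 minus the mean proportion of shared alleles across diploid loci."""
    total = 0.0
    for (a1, a2), (b1, b2) in zip(geno_a, geno_b):
        remaining = [b1, b2]
        shared = 0
        for allele in (a1, a2):
            if allele in remaining:
                remaining.remove(allele)
                shared += 1
        total += shared / 2.0
    return 1.0 - total / len(geno_a)


def upgma(names, dist):
    """Cluster with UPGMA and return the tree as nested tuples."""
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (tree, size)
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(names)
    while len(clusters) > 1:
        # Closest pair; ties are broken by cluster id for reproducibility.
        i, j = min(d, key=lambda pair: (d[pair], pair))
        tree_i, n_i = clusters.pop(i)
        tree_j, n_j = clusters.pop(j)
        new = {}
        for k in clusters:
            d_ik = d[(min(i, k), max(i, k))]
            d_jk = d[(min(j, k), max(j, k))]
            # Size-weighted average distance to the merged cluster.
            new[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        d.update(new)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical genotypes: two loci per sample, fragment sizes in bp.
genotypes = {
    "S1": [(150, 150), (200, 204)],
    "S2": [(150, 150), (200, 204)],
    "S3": [(150, 154), (200, 200)],
    "S4": [(158, 162), (210, 214)],
}
names = list(genotypes)
dist = [[shared_allele_distance(genotypes[a], genotypes[b]) for b in names] for a in names]
print(upgma(names, dist))  # ('S4', ('S3', ('S1', 'S2')))
```

Samples with identical allele patterns, such as S1 and S2 here, have a distance of zero and are joined first, which is how identical individuals appear as a single tight cluster in a UPGMA tree.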
The genetic characteristics of the 25 SSR loci applied to the 20 samples are summarized in Table 5. Key metrics, including allele frequency, number of alleles per locus, and Polymorphism Information Content (PIC), were evaluated, confirming the effectiveness of the selected markers in detecting genetic variation. These results further support the applicability of the identified markers for future population genetics and conservation studies of C. kanran.
Discussion
This study aimed to explore the genetic diversity of C. kanran and to develop polymorphic SSR markers through whole-genome sequencing and comparative genomic analysis [8, 17]. As C. kanran is currently classified as an endangered species in Korea, developing markers specific to C. kanran native to Jeju Island is imperative. The findings provide important insights into genetic structure and marker development, with potential applications in conservation and breeding programs. Although markers for C. kanran have been studied before, those studies employed older technology, and the literature is limited. The first reported marker was an ERAPD marker, whose specificity and reproducibility were lower than those of markers in common use today, and most of the other markers were developed using species native to China [18-20]. In contrast, the marker candidates developed here for the genetic analysis of individuals adapted to Jeju environmental conditions were based on more recent technologies, such as NGS and in silico methods. Previously, fourteen novel SSR markers were identified in Cymbidium spp. and successfully employed to measure the genetic diversity and relationships within a Cymbidium collection [1]. These fourteen SSR markers were also applied in a genetic diversity study of Cymbidium species in the Republic of Korea, which revealed that C. goeringii is more abundant and widespread in Korea than C. sinensis [21, 22].
The initial RAPD PCR analysis and dendrogram construction revealed that four native habitat samples (JCK-02, JCK-03, JCK-04, and JCK-05) were genetically identical or highly similar, indicating limited genetic variation among these individuals. This lack of diversity may result from habitat fragmentation, clonal propagation, or restricted gene flow. Similar patterns have been reported in other orchid species under environmental stress or isolated conditions. These findings highlight the necessity of conservation strategies aimed at preserving or increasing genetic diversity within native populations.
Whole-genome sequencing of two genetically distinct samples, JCK-01 and JCK-07, produced high-quality genomic data with approximately 15x genome coverage, confirming the feasibility of genome-wide SSR marker discovery. The GC content (approximately 33%) was consistent with values reported in related orchid species, validating the sequencing results. A total of 86 candidate polymorphic SSR markers were identified through in silico analysis, demonstrating the utility of high-throughput sequencing in marker development for non-model plant species. A systematic screening process resulted in the selection of 40 polymorphic SSR markers after removing loci with low amplification efficiency or limited polymorphism. These markers were tested on 16 samples, revealing an average of 6.2 alleles per locus, with PIC values ranging from 0.06 to 0.90 [23]. The observed allele diversity confirmed the effectiveness of these markers in capturing genetic variation [24, 25]. Further evaluation led to the final selection of 25 high-quality SSR markers based on key criteria such as consistent amplification, the detection of two or fewer alleles per plant, and the absence of amplification artifacts. To enhance the reliability of the genetic diversity analysis, four additional samples were included, bringing the total to 20 analyzed samples.
A new UPGMA dendrogram constructed using these samples provided a clearer view of the population structure and genetic relationships. Broader genetic differentiation was observed, suggesting a more comprehensive representation of C. kanran's genetic diversity. The present study was limited to 20 samples because of restrictions imposed by Korean collecting regulations. This was sufficient for our goal of identifying C. kanran of Jeju Island with higher accuracy, rather than characterizing the entire C. kanran species. For comprehensive identification across the whole species, however, previously established markers may prove more effective than the markers presented here.
The identification of genetically similar individuals within native habitats underscores the urgent need for conservation measures to preserve genetic diversity. Strategies such as habitat restoration, gene flow enhancement, and the introduction of genetically diverse individuals could strengthen population resilience. Additionally, the developed markers have several applications, including cultivar identification and purity assessment, assessment of genetic diversity and parental selection, identification of genomic regions under selection, and marker-assisted backcrossing [26-28]. This novel set of SSR markers makes the breeding process more precise and efficient, ultimately contributing to the preservation, breeding, and genetic monitoring of C. kanran. Overall, this study successfully developed and validated a set of robust SSR markers for C. kanran, providing a foundation for future research in population genetics, conservation, and breeding. The combination of genome-wide sequencing, marker selection, and expanded genetic analysis highlights the power of molecular tools in advancing conservation genomics for rare and ecologically significant plant species.
Supplemental Materials
Supplementary data for this paper are available online only at http://jmb.or.kr.
References
1. Balilashaki K, Martinez-Montero ME, Vahedi M, Cardoso JC, Silva Agurto CL, Leiva-Mora M. 2023. Medicinal use, flower trade, preservation and mass propagation techniques of Cymbidium orchids-an overview. Horticulturae 9: 690. doi: 10.3390/horticulturae9060690
2. Moe KT, Zhao W, Song HS, Kim YH, Chung JW, Cho YI. 2010. Development of SSR markers to study diversity in the genus Cymbidium. Biochem. Syst. Ecol. 38: 585-594. doi: 10.1016/j.bse.2010.07.004
3. Hyun HJ, Kim HR, Choi HS, Kim CS. 2014. Distribution and vegetation structure of genus Cymbidium (Orchidaceae) in Jeju Island. Korean J. Agri. For. Meteorol. 16: 1-10. doi: 10.5532/KJAFM.2014.16.1.1
4. Lee JS. 2004. Habitat characteristics and distribution of Cymbidium kanran native to Jejudo, Korea. J. Korean Env. Res. Reveg. Tech. 7: 40-49.
5. Zeng RZ, Zhu J, Xu SY, Du GH, Guo HR, Chen J. 2020. Unreduced male gamete formation in Cymbidium and its use for developing sexual polyploid cultivars. Front. Plant Sci. 11: 558. doi: 10.3389/fpls.2020.00558. PMID: 32499802. PMCID: PMC7243674
6. Luo H, Xiao H, Wu X, Liu N, Chen X, Xiong D. 2024. Cymbidium kanran can deceptively attract Apis cerana for free pollination by releasing specialized volatile compounds. Nat. Conserv. 56: 83-100. doi: 10.3897/natureconservation.56.126919
7. Balilashaki K, Martinez-Montero ME, Vahedi M, Cardoso JC, Silva Agurto CL, Leiva-Mora M. 2023. Medicinal use, flower trade, preservation and mass propagation techniques of Cymbidium orchids-an overview. Horticulturae 9: 690. doi: 10.3390/horticulturae9060690
8. Huang Y, Li F, Chen K. 2010. Analysis of diversity and relationships among Chinese orchid cultivars using EST-SSR markers. Biochem. Syst. Ecol. 38: 93-102. doi: 10.1016/j.bse.2009.12.018
