# Identification and Development of Pathogen- and Pest-Specific Defense–Resistance-Associated SSR Marker Candidates Assisted by Machine Learning and Discovery of Putative QTL Hotspots in Camellia sinensis

**Authors:** Ayşenur Eminoğlu

PMC · DOI: 10.3390/plants15030454 · Plants · 2026-02-02

## TL;DR

This study develops SSR markers linked to disease and pest resistance in tea plants using machine learning and identifies potential QTL hotspots.

## Contribution

A novel, biologically relevant SSR marker resource for defense–resistance traits in Camellia sinensis is developed using machine learning and QTL hotspot analysis.

## Key findings

- 5197 SSR loci were identified in defense-related gene regions of Camellia sinensis.
- 633 SSRs were retained as putative QTL hotspot markers at the 99% and 95% significance thresholds, based on sliding-window analysis, Z-scores, and permutation tests.
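- 386 SSR primer sets were designed and tested by in silico PCR across six tea genomes; 245 produced amplicons in more than one genome, and 124 had polymorphic information content (PIC) values greater than 0.500.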
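
For readers unfamiliar with the PIC values mentioned in the last finding, here is a minimal sketch of the widely used Botstein et al. formulation, PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ², where pᵢ is the frequency of allele i. This summary does not state which formula or software the study used, so treat this as an illustration only. The function name `pic` and the amplicon sizes are hypothetical, and treating each in silico amplicon size as one allele call is an assumption.

```python
from collections import Counter
from itertools import combinations


def pic(alleles):
    """Polymorphic information content for one marker.

    `alleles` is a list of allele calls (here: amplicon sizes in bp).
    Uses PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one allele call is required")
    freqs = [c / total for c in counts.values()]
    # Expected homozygosity term.
    sum_sq = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - cross


# Hypothetical amplicon sizes (bp) for one primer pair across six genomes.
sizes = [180, 180, 184, 184, 188, 188]
print(round(pic(sizes), 4))
```

With three equally frequent alleles, as in this made-up example, the value is about 0.593, which would pass a PIC > 0.500 cutoff; a marker with a single allele scores 0.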

## Abstract

In this study, a targeted SSR (Simple Sequence Repeat) marker resource was developed based on genes and protein families associated with pathogen- and pest-related defense–resistance mechanisms in Camellia sinensis. Forty-one genes and protein families reported to show upregulation, increased expression, or functional validation under disease and pest stress were selected, and the corresponding 195 loci were mapped onto the Camellia sinensis cv. Shuchazao genome. SSR screening within gene bodies and gene-flanking regions (±5 kb) identified 5197 SSR loci. Putative QTL hotspot regions were defined using locus-based sliding-window analysis, Z-score calculations, and permutation tests, yielding 633 SSRs filtered at the 99% and 95% significance thresholds. Proteome-wide scans based on conserved amino acid motifs identified multiple loci within the WRKY, NAC, LRR, PRX, and CHI families, and Random Forest analysis was used to prioritize SSRs within these families. Finally, 386 SSR primer sets were designed and evaluated by in silico PCR across six tea genomes. Of these, 245 primers produced amplicons in more than one genome, and 124 exhibited polymorphic information content values greater than 0.500. Overall, the developed SSR panels represent a biologically contextualized and experimentally transferable marker resource targeting defense–resistance-associated genic and gene-proximal regions.
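
The hotspot step described above (sliding windows, Z-scores, permutation tests) can be sketched roughly as follows. The abstract does not give the window size, step, or null model, so everything here is an assumption: windows are laid along a single coordinate axis, the null model places SSRs uniformly at random, and the threshold is a quantile of the per-permutation maximum window count. All function names are hypothetical and this is not the study's pipeline.

```python
import random
import statistics


def window_counts(positions, length, window, step):
    """Count SSR positions falling in each half-open window [start, start + window)."""
    counts = []
    start = 0
    while start < length:
        end = start + window
        counts.append(sum(start <= p < end for p in positions))
        start += step
    return counts


def z_scores(counts):
    """Standardize window counts against their own mean and population SD."""
    mean = statistics.mean(counts)
    sd = statistics.pstdev(counts)
    if sd == 0:
        return [0.0 for _ in counts]
    return [(c - mean) / sd for c in counts]


def permutation_threshold(n_ssr, length, window, step,
                          quantile=0.95, n_perm=1000, seed=0):
    """Quantile of the maximum window count under random SSR placement.

    Assumed null model: n_ssr positions drawn uniformly from [0, length).
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        shuffled = [rng.randrange(length) for _ in range(n_ssr)]
        maxima.append(max(window_counts(shuffled, length, window, step)))
    maxima.sort()
    idx = min(len(maxima) - 1, int(quantile * len(maxima)))
    return maxima[idx]


# Hypothetical SSR coordinates on a 1 kb region, clustered near position 100.
ssr_positions = [90, 95, 100, 105, 110, 115, 400, 700, 950]
counts = window_counts(ssr_positions, length=1000, window=100, step=50)
cutoff = permutation_threshold(len(ssr_positions), 1000, 100, 50, quantile=0.95)
hotspots = [i for i, c in enumerate(counts) if c > cutoff]
print(counts, cutoff, hotspots)
```

Running the threshold function at `quantile=0.99` and `quantile=0.95` gives two stringency levels, mirroring the two thresholds the abstract reports. A per-locus window definition, as "locus-based" suggests, would replace the fixed coordinate grid used here.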

## Linked entities

- **Genes:** WRKY (probable WRKY transcription factor 33) [NCBI Gene 103865671], XK (X-linked Kx blood group antigen, Kell and VPS13A binding protein) [NCBI Gene 7504], LRR (Leucine-rich repeat) [NCBI Gene 35715], PRX (periaxin) [NCBI Gene 57716], Chi (Chip) [NCBI Gene 37837]
- **Species:** Camellia sinensis (taxon 4442)

## Full-text entities

- **Species:** Camellia sinensis (black tea, species) [taxon 4442]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12899447/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12899447/full.md

## References

74 references — full list in the complete paper: https://tomesphere.com/paper/PMC12899447/full.md

---
Source: https://tomesphere.com/paper/PMC12899447