# Comparative analysis of RAD-seq methods for SNP discovery and genetic diversity assessment in oil seed crop safflower

**Authors:** Pooja Pathania, Gaddam Prasanna Kumar, Nishu Gupta, R. Parimalan, J. Radhamani, Rajesh Kumar, Sunil Shriram Gomashe, Palchamy Kadirvel, S. Rajkumar

PMC · DOI: 10.1038/s41598-025-06706-2 · Scientific Reports · 2025-07-02

## TL;DR

This study compares two RAD-seq approaches (sdRAD-seq and ddRAD-seq) and three restriction enzyme combinations for SNP discovery in safflower, and finds that ddRAD-seq with EcoRI_MseI is the most suitable for genome sampling and SNP genotyping.

## Contribution

The study identifies the optimal RAD-seq method and enzyme combination for SNP genotyping in safflower.

## Key findings

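- ddRAD-seq outperformed sdRAD-seq in raw read count, alignment rate, depth and breadth of coverage, and SNP detection.
- EcoRI_MseI captured the most SNPs (221,805, versus 173,212 for NlaIII_MseI and 6,721 for ApeKI) with fewer missing observations.
- Principal component analysis of ddRAD-seq data explained 30.29% (NlaIII_MseI) and 33.98% (EcoRI_MseI) of the total genetic variation among the 42 safflower accessions.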

## Abstract

Safflower (Carthamus tinctorius L.) is an important oilseed crop with diverse uses and the potential for genetic improvement. This study aimed to optimize genotyping-by-sequencing (GBS) for safflower via in silico and in vitro methods with two restriction site-associated DNA sequencing (RAD-seq) approaches, i.e., single-digest RAD sequencing (sdRAD-seq) and double-digest RAD sequencing (ddRAD-seq), and three restriction enzyme combinations (ApeKI, NlaIII_MseI, and EcoRI_MseI). Forty-two safflower accessions were selected for this study. In silico testing revealed that NlaIII_MseI generated the largest number of DNA fragments, followed by ApeKI and EcoRI_MseI. The in vitro results showed that ddRAD-seq outperformed sdRAD-seq in terms of raw read count, alignment rate, depth and breadth of coverage, and SNP detection. An alignment-free analysis using k-mer counting and sketching based on genetic distance further confirmed the superiority of ddRAD-seq. Gene-level k-mer validation identified more core genes in the ddRAD-seq data. Variant calling resulted in 6,721, 173,212, and 221,805 single-nucleotide polymorphisms (SNPs) for ApeKI, NlaIII_MseI, and EcoRI_MseI, respectively. SNP annotation and distribution analysis revealed that EcoRI_MseI captured more SNPs with fewer missing observations. Principal component analysis of the ddRAD-seq data explained 30.29% and 33.98% of the total genetic variation for NlaIII_MseI and EcoRI_MseI, respectively. This study demonstrated that ddRAD-seq with the EcoRI_MseI enzyme combination is the most suitable GBS approach for genome sampling and SNP genotyping in safflower.
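The in silico step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`cut_positions`, `digest`, `ddrad_fragments`), the size-selection parameters, and the toy sequence are all hypothetical. The recognition sites are the standard ones for these enzymes (ApeKI G^CWGC, NlaIII CATG^, MseI T^TAA, EcoRI G^AATTC). Because all four sites are palindromic, scanning one strand is enough to locate every site.

```python
import re

# Recognition site (as a regex) and the offset within the site after which
# the enzyme cuts the top strand. W is the IUPAC code for A or T.
ENZYMES = {
    "ApeKI": ("GC[AT]GC", 1),  # G^CWGC
    "NlaIII": ("CATG", 4),     # CATG^
    "MseI": ("TTAA", 1),       # T^TAA
    "EcoRI": ("GAATTC", 1),    # G^AATTC
}


def cut_positions(seq, enzyme):
    """Return 0-based cut coordinates of one enzyme on the given strand."""
    pattern, offset = ENZYMES[enzyme]
    # A lookahead yields zero-width matches, so overlapping sites are found.
    return [m.start() + offset
            for m in re.finditer(f"(?={pattern})", seq.upper())]


def digest(seq, enzymes, min_len=0, max_len=None):
    """Return internal fragments as (start, end, left_enzyme, right_enzyme).

    Only fragments bounded by a cut on both sides are reported; the two
    sequence ends are ignored. min_len/max_len mimic size selection and
    are hypothetical parameters.
    """
    cuts = sorted((pos, name)
                  for name in enzymes
                  for pos in cut_positions(seq, name))
    fragments = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        length = end - start
        if length < min_len or (max_len is not None and length > max_len):
            continue
        fragments.append((start, end, left, right))
    return fragments


def ddrad_fragments(seq, enzyme_a, enzyme_b, min_len=0, max_len=None):
    """Keep only fragments flanked by one cut from each of the two enzymes."""
    return [f for f in digest(seq, [enzyme_a, enzyme_b], min_len, max_len)
            if f[2] != f[3]]


# Toy sequence (made up for illustration) with two EcoRI and two MseI sites.
toy = "ACGT" "GAATTC" "ACGTACGT" "TTAA" "CCGG" "TTAA" "GGCC" "GAATTC" "AC"
print(ddrad_fragments(toy, "EcoRI", "MseI"))
# [(5, 19, 'EcoRI', 'MseI'), (27, 35, 'MseI', 'EcoRI')]
```

On a real reference genome the same counting would be run per chromosome or scaffold, usually with a size window applied through `min_len` and `max_len`. The MseI-to-MseI fragment in the toy example is dropped because ddRAD-seq libraries retain fragments carrying one adapter of each type.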

The online version contains supplementary material available at 10.1038/s41598-025-06706-2.
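The alignment-free comparison mentioned in the abstract relies on k-mer counting and sketch-based genetic distance. The summary does not say which tool or parameters were used, so the following is only a sketch of the general idea, under stated assumptions: it computes an exact Jaccard index over canonical k-mer sets (real tools approximate this with MinHash sketches) and converts it to a distance with the Mash formula, D = -(1/k) ln(2j / (1 + j)). The function names and the choice of k are hypothetical.

```python
import math

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers (the lexicographically smaller of
    each k-mer and its reverse complement), skipping ambiguous bases."""
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # contains N or another non-ACGT character
        revcomp = kmer.translate(_COMPLEMENT)[::-1]
        kmers.add(min(kmer, revcomp))
    return kmers


def jaccard(a, b):
    """Exact Jaccard index of two k-mer sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(j, k):
    """Convert a Jaccard index j into a Mash-style distance for k-mer size k.

    A Jaccard index of 0 is mapped to a distance of 1.0 by convention.
    """
    if j <= 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


def pairwise_distance(seq_a, seq_b, k=21):
    """Distance between two sequences; k=21 is an assumed default."""
    return mash_distance(
        jaccard(canonical_kmers(seq_a, k), canonical_kmers(seq_b, k)), k)
```

Applied to read sets rather than single sequences, each sample's k-mer set would be built from all of its reads, typically after discarding low-count k-mers that are likely sequencing errors.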

## Linked entities

- **Species:** Carthamus tinctorius (taxon 4222)

## Full-text entities

- **Species:** Carthamus tinctorius (safflower, species) [taxon 4222]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12217066/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12217066/full.md

## References

13 references — full list in the complete paper: https://tomesphere.com/paper/PMC12217066/full.md

---
Source: https://tomesphere.com/paper/PMC12217066