Comparative analysis of RAD-seq methods for SNP discovery and genetic diversity assessment in oil seed crop safflower
| Main Authors: | , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1038/s41598-025-06706-2 |
| Summary: | Safflower (Carthamus tinctorius L.) is an important oilseed crop with diverse uses and potential for genetic improvement. This study aimed to optimize genotyping-by-sequencing (GBS) for safflower using in silico and in vitro methods with two restriction site-associated DNA sequencing (RAD-seq) approaches, single-digest RAD sequencing (sdRAD-seq) and double-digest RAD sequencing (ddRAD-seq), and three restriction enzyme combinations (ApeKI, NlaIII_MseI, and EcoRI_MseI). Forty-two safflower accessions were selected for this study. In silico testing revealed that NlaIII_MseI generated the largest number of DNA fragments, followed by ApeKI and EcoRI_MseI. The in vitro results showed that ddRAD-seq outperformed sdRAD-seq in raw read count, alignment rate, depth and breadth of coverage, and SNP detection. An alignment-free analysis using k-mer counting and sketching based on genetic distance further confirmed the superiority of ddRAD-seq. Gene-level k-mer validation identified more core genes in the ddRAD-seq data. Variant calling yielded 6,721, 173,212, and 221,805 single nucleotide polymorphisms (SNPs) for ApeKI, NlaIII_MseI, and EcoRI_MseI, respectively. SNP annotation and distribution analysis revealed that EcoRI_MseI captured more SNPs with fewer missing observations. Principal component analysis of the ddRAD-seq data explained 30.29% and 33.98% of the total genetic variation for NlaIII_MseI and EcoRI_MseI, respectively. This study demonstrated that ddRAD-seq with the EcoRI_MseI enzyme combination is the most suitable GBS approach for genome sampling and SNP genotyping in safflower. |
|---|---|
| ISSN: | 2045-2322 |
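
The in silico digestion step mentioned in the summary can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which this record does not describe. The function names (`cut_positions`, `digest`) and the `mixed_ends_only` flag are hypothetical and introduced here for illustration only. The recognition sites are the standard ones for these enzymes (ApeKI G^CWGC, NlaIII CATG^, MseI T^TAA, EcoRI G^AATTC). The sketch assumes a single linear sequence and ignores methylation sensitivity and adapter design. Size selection is reduced to optional length bounds.

```python
import re

# Recognition pattern (as a regex) and top-strand cut offset within the site.
ENZYMES = {
    "ApeKI": ("GC[AT]GC", 1),  # G^CWGC, W = A or T
    "NlaIII": ("CATG", 4),     # CATG^
    "MseI": ("TTAA", 1),       # T^TAA
    "EcoRI": ("GAATTC", 1),    # G^AATTC
}


def cut_positions(seq, enzyme):
    """Return (cut_position, enzyme) pairs for every site of `enzyme` in `seq`."""
    pattern, offset = ENZYMES[enzyme]
    # A lookahead is used so that overlapping sites are also reported.
    return [(m.start() + offset, enzyme)
            for m in re.finditer(f"(?={pattern})", seq.upper())]


def digest(seq, enzymes, mixed_ends_only=False, min_len=1, max_len=None):
    """Simulate a digest and return fragments between adjacent cut sites.

    Each fragment is (start, end, left_enzyme, right_enzyme). With
    mixed_ends_only=True, only fragments flanked by two different enzymes
    are kept, which is the fragment class a ddRAD-style library targets.
    The two terminal pieces of the sequence are not reported.
    """
    cuts = sorted(c for e in enzymes for c in cut_positions(seq, e))
    fragments = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        length = end - start
        if length < min_len or (max_len is not None and length > max_len):
            continue
        if mixed_ends_only and left == right:
            continue
        fragments.append((start, end, left, right))
    return fragments


if __name__ == "__main__":
    # Toy sequence with two EcoRI sites and two MseI sites.
    toy = "AAAAGAATTCCCCCCCCCTTAAGGGGTTAAAAAAGAATTCAA"
    for frag in digest(toy, ["EcoRI", "MseI"], mixed_ends_only=True):
        print(frag)
```

Comparing the number of fragments returned for each enzyme set on a reference genome, within a chosen size window, gives the kind of fragment-count comparison the summary reports. The actual counts depend on the genome and on the size-selection range used.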