Comparative analysis of RAD-seq methods for SNP discovery and genetic diversity assessment in oil seed crop safflower

Bibliographic Details
Main Authors: Pooja Pathania, Gaddam Prasanna Kumar, Nishu Gupta, R. Parimalan, J. Radhamani, Rajesh Kumar, Sunil Shriram Gomashe, Palchamy Kadirvel, S. Rajkumar
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-06706-2
Description
Summary: Safflower (Carthamus tinctorius L.) is an important oilseed crop with diverse uses and potential for genetic improvement. This study aimed to optimize genotyping-by-sequencing (GBS) for safflower using in silico and in vitro methods with two restriction site-associated DNA sequencing (RAD-seq) approaches, namely single-digest RAD sequencing (sdRAD-seq) and double-digest RAD sequencing (ddRAD-seq), and three restriction enzyme combinations (ApeKI, NlaIII_MseI, and EcoRI_MseI). Forty-two safflower accessions were selected for the study. In silico testing revealed that NlaIII_MseI generated the largest number of DNA fragments, followed by ApeKI and EcoRI_MseI. The in vitro results showed that ddRAD-seq outperformed sdRAD-seq in raw read count, alignment rate, depth and breadth of coverage, and SNP detection. An alignment-free analysis using k-mer counting and sketching-based genetic distances further confirmed the superiority of ddRAD-seq. Gene-level k-mer validation identified more core genes in the ddRAD-seq data. Variant calling yielded 6,721, 173,212, and 221,805 single nucleotide polymorphisms (SNPs) for ApeKI, NlaIII_MseI, and EcoRI_MseI, respectively. SNP annotation and distribution analysis revealed that EcoRI_MseI captured more SNPs with fewer missing observations. In principal component analysis of the ddRAD-seq data, 30.29% and 33.98% of the total genetic variation was explained for NlaIII_MseI and EcoRI_MseI, respectively. The study demonstrated that ddRAD-seq with the EcoRI_MseI enzyme combination is the most suitable GBS approach for genome sampling and SNP genotyping in safflower.
ISSN: 2045-2322
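
The in silico digestion step mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`cut_positions`, `digest`), the size-selection parameters, and the toy sequence are assumptions made for illustration. Only the enzyme recognition sites are standard (ApeKI G^CWGC, NlaIII CATG^, MseI T^TAA, EcoRI G^AATTC). For a double digest, the sketch keeps only fragments flanked by two different enzymes, as is usual in ddRAD-seq, and it ignores the terminal fragments at the sequence ends.

```python
import re

# Standard recognition sites (IUPAC codes) and the cut offset within the
# site on the top strand. All four sites are palindromic, so scanning one
# strand is sufficient.
ENZYMES = {
    "ApeKI": ("GCWGC", 1),    # G^CWGC
    "NlaIII": ("CATG", 4),    # CATG^
    "MseI": ("TTAA", 1),      # T^TAA
    "EcoRI": ("GAATTC", 1),   # G^AATTC
}
IUPAC = {"W": "[AT]"}  # only the ambiguity code needed for these sites


def cut_positions(seq, enzyme):
    """Return 0-based cut coordinates of one enzyme in seq (hypothetical helper)."""
    site, offset = ENZYMES[enzyme]
    pattern = "".join(IUPAC.get(base, base) for base in site)
    # A lookahead lets overlapping occurrences of the site be reported.
    return [m.start() + offset
            for m in re.finditer(f"(?={pattern})", seq.upper())]


def digest(seq, enzymes, min_len=1, max_len=None):
    """Return internal fragments as (start, end, left_enzyme, right_enzyme)."""
    cuts = sorted((pos, name)
                  for name in enzymes
                  for pos in cut_positions(seq, name))
    fragments = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        length = end - start
        # Size selection (assumed parameters, not taken from the study).
        if length < min_len or (max_len is not None and length > max_len):
            continue
        # Double digest: keep only fragments with two different cut ends.
        if len(enzymes) == 2 and left == right:
            continue
        fragments.append((start, end, left, right))
    return fragments


if __name__ == "__main__":
    toy = "AAGAATTCGGGGTTAACCCCGAATTCAATTAAGG"  # made-up sequence
    for fragment in digest(toy, ["EcoRI", "MseI"]):
        print(fragment)
```

Running `digest` over each chromosome of a reference genome with a realistic size window (for example `min_len=200, max_len=600`, again an assumption) and counting the returned fragments gives the kind of per-enzyme fragment totals that the summary compares.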