A dataset of 40 assembled and annotated transcriptomes from 34 species in Silene and related genera (Mendeley Data)

A dataset of 40 assembled and annotated transcriptomes from 34 different species sampled from phylogenetically diverse parts of the flowering plant genus Silene (Caryophyllaceae) and the related genera Agrostemma, Atocion, Eudianthe, Heliosperma, Petrocoptis and Viscaria. RNA extracted from roots, s...

Bibliographic Details
Main Authors: Patrik Cangren, Yann J.K. Bertrand, John M. Braverman, Gregor Duncan Gilfillan, Matthew B. Hamilton, Bengt Oxelman
Format: Article
Language: English
Published: Elsevier 2024-12-01
Series: Data in Brief
Subjects: Genomics; Phylogenetics; RNA-transcripts; Nucleotide; Assembly; Functional-annotation
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340924010564
_version_ 1846164539455832064
author Patrik Cangren
Yann J.K. Bertrand
John M. Braverman
Gregor Duncan Gilfillan
Matthew B. Hamilton
Bengt Oxelman
author_facet Patrik Cangren
Yann J.K. Bertrand
John M. Braverman
Gregor Duncan Gilfillan
Matthew B. Hamilton
Bengt Oxelman
author_sort Patrik Cangren
collection DOAJ
description This dataset comprises 40 assembled and annotated transcriptomes from 34 different species sampled from phylogenetically diverse parts of the flowering plant genus Silene (Caryophyllaceae) and the related genera Agrostemma, Atocion, Eudianthe, Heliosperma, Petrocoptis and Viscaria. RNA extracted from roots, stems, leaves, buds and flowers was sequenced using paired-end reads on the Illumina HiSeq platform. A total of 716 million raw reads were produced and assembled into 2.67 million isogroups (“genes”). Contigs from all samples were annotated using UniProt/SwissProt and assigned GO terms. A total of 974,274 annotations were made (per-sample average 24,357, stdev 7,034), giving an annotation proportion of 37% (per-sample average 39%, stdev 9.75%). Of these annotations, 741,087 had taxonomic identities within Magnoliopsida (per-sample average 18,527, stdev 3,931), resulting in the assignment of 4,519,488 GO terms (per-sample average 112,987, stdev 22,536). The dataset can be reused in biological research, including phylogenetic studies, evolutionary questions, functional analyses of genes, studies of polyploidy, and marker development.
format Article
id doaj-art-4db2485cf56c45a59b4ec0ae49861ffe
institution Kabale University
issn 2352-3409
language English
publishDate 2024-12-01
publisher Elsevier
record_format Article
series Data in Brief
spelling doaj-art-4db2485cf56c45a59b4ec0ae49861ffe
2024-11-18T04:33:20Z
eng
Elsevier
Data in Brief
2352-3409
2024-12-01
volume 57, article 111094
A dataset of 40 assembled and annotated transcriptomes from 34 species in Silene and related genera (Mendeley Data)
Patrik Cangren: Department of Biological and Environmental Sciences, University of Gothenburg, Medicinaregatan 7B, Goteborg 413 90, Sweden (corresponding author)
Yann J.K. Bertrand: Laboratory of Molecular Biology and Bioinformatics, Institute of Botany, Academy of Sciences of the Czech Republic, CZ-252 43 Průhonice, Czech Republic
John M. Braverman: Department of Biology, The Science Center, Saint Josephs University, 5600 City Ave., Philadelphia, PA 19131, USA
Gregor Duncan Gilfillan: Department of Medical Genetics, Oslo University Hospital and University of Oslo, Kirkeveien 166, 0450 Oslo, Norway
Matthew B. Hamilton: Department of Biology, Georgetown University, 37th and O Streets NW, Washington, DC 20057, USA
Bengt Oxelman: Department of Biological and Environmental Sciences, University of Gothenburg, Medicinaregatan 7B, Goteborg 413 90, Sweden
This dataset comprises 40 assembled and annotated transcriptomes from 34 different species sampled from phylogenetically diverse parts of the flowering plant genus Silene (Caryophyllaceae) and the related genera Agrostemma, Atocion, Eudianthe, Heliosperma, Petrocoptis and Viscaria. RNA extracted from roots, stems, leaves, buds and flowers was sequenced using paired-end reads on the Illumina HiSeq platform. A total of 716 million raw reads were produced and assembled into 2.67 million isogroups (“genes”). Contigs from all samples were annotated using UniProt/SwissProt and assigned GO terms. A total of 974,274 annotations were made (per-sample average 24,357, stdev 7,034), giving an annotation proportion of 37% (per-sample average 39%, stdev 9.75%). Of these annotations, 741,087 had taxonomic identities within Magnoliopsida (per-sample average 18,527, stdev 3,931), resulting in the assignment of 4,519,488 GO terms (per-sample average 112,987, stdev 22,536). The dataset can be reused in biological research, including phylogenetic studies, evolutionary questions, functional analyses of genes, studies of polyploidy, and marker development.
http://www.sciencedirect.com/science/article/pii/S2352340924010564
Genomics
Phylogenetics
RNA-transcripts
Nucleotide
Assembly
Functional-annotation
spellingShingle Patrik Cangren
Yann J.K. Bertrand
John M. Braverman
Gregor Duncan Gilfillan
Matthew B. Hamilton
Bengt Oxelman
A dataset of 40 assembled and annotated transcriptomes from 34 species in Silene and related genera (Mendeley Data)
Data in Brief
Genomics
Phylogenetics
RNA-transcripts
Nucleotide
Assembly
Functional-annotation
title A dataset of 40 assembled and annotated transcriptomes from 34 species in Silene and related genera (Mendeley Data)
title_full A dataset of 40 assembled and annotated transcriptomes from 34 species in Silene and related genera (Mendeley Data)
title_fullStr A dataset of 40 assembled and annotated transcriptomes from 34 species in Silene and related genera (Mendeley Data)
title_full_unstemmed A dataset of 40 assembled and annotated transcriptomes from 34 species in Silene and related genera (Mendeley Data)
title_short A dataset of 40 assembled and annotated transcriptomes from 34 species in Silene and related genera (Mendeley Data)
title_sort dataset of 40 assembled and annotated transcriptomes from 34 species in silene and related genera mendeley data
topic Genomics
Phylogenetics
RNA-transcripts
Nucleotide
Assembly
Functional-annotation
url http://www.sciencedirect.com/science/article/pii/S2352340924010564
work_keys_str_mv AT patrikcangren adatasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT yannjkbertrand adatasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT johnmbraverman adatasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT gregorduncangilfillan adatasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT matthewbhamilton adatasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT bengtoxelman adatasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT patrikcangren datasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT yannjkbertrand datasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT johnmbraverman datasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT gregorduncangilfillan datasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT matthewbhamilton datasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
AT bengtoxelman datasetof40assembledandannotatedtranscriptomesfrom34speciesinsileneandrelatedgeneramendeleydata
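
The summary statistics quoted in the description field can be sanity-checked with simple arithmetic. The following is a minimal Python sketch, not part of the published dataset or its tooling. The constant and function names are hypothetical, and it assumes the per-sample averages are plain means over the 40 transcriptomes. Because the isogroup total is given only as a rounded "2.67 million", the pooled annotation proportion can be recovered only approximately.

```python
# Totals as quoted in the record's description field.
N_SAMPLES = 40
TOTAL_ISOGROUPS = 2.67e6            # rounded in the source ("2.67 million")
TOTAL_ANNOTATIONS = 974_274
MAGNOLIOPSIDA_ANNOTATIONS = 741_087
TOTAL_GO_TERMS = 4_519_488


def per_sample_mean(total, n_samples=N_SAMPLES):
    """Plain arithmetic mean of a total over the samples (an assumption)."""
    return total / n_samples


annotation_mean = per_sample_mean(TOTAL_ANNOTATIONS)
magnoliopsida_mean = per_sample_mean(MAGNOLIOPSIDA_ANNOTATIONS)
go_term_mean = per_sample_mean(TOTAL_GO_TERMS)

# Pooled proportion: all annotations divided by all isogroups.
pooled_proportion = TOTAL_ANNOTATIONS / TOTAL_ISOGROUPS

print(f"annotations per sample:   {annotation_mean:.0f}")
print(f"Magnoliopsida per sample: {magnoliopsida_mean:.0f}")
print(f"GO terms per sample:      {go_term_mean:.0f}")
print(f"pooled annotation share:  {pooled_proportion:.1%}")
```

The three per-sample means round to the figures quoted in the description. The pooled share comes out near 36.5%, which is consistent with the reported 37% given the rounding of the isogroup count. The separately reported per-sample average of 39% is a mean of 40 individual proportions, so it need not equal the pooled value.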