Prediction of transcript isoforms and identification of tissue-specific genes in cucumber
Abstract Background Identification of global transcriptional events is crucial for genome annotation, as accurate annotation enhances the efficiency and comparability of genomic information across species. However, the annotation of transcripts in the cucumber genome remains to be improved, and many...
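Main Authors: | Wenjiao Wang, Chengcheng Shen, Xinqiang Wen, Anqi Li, Qi Gao, Zhaoying Xu, Yuping Wei, Yushun Li, Dailu Guan, Bin Liu |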
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2025-01-01 |
Series: | BMC Genomics |
Subjects: | Cucumber, RNA-seq, Transcript isoform, Tissue-specific |
Online Access: | https://doi.org/10.1186/s12864-025-11212-w |
_version_ | 1841545010208571392 |
---|---|
author | Wenjiao Wang, Chengcheng Shen, Xinqiang Wen, Anqi Li, Qi Gao, Zhaoying Xu, Yuping Wei, Yushun Li, Dailu Guan, Bin Liu |
author_sort | Wenjiao Wang |
collection | DOAJ |
description | Abstract. Background: Identification of global transcriptional events is crucial for genome annotation, as accurate annotation enhances the efficiency and comparability of genomic information across species. However, the annotation of transcripts in the cucumber genome remains to be improved, and many transcriptional events have not been well studied. Results: We collected 1,904 high-quality public cucumber transcriptome samples from the National Center for Biotechnology Information (NCBI) to identify and annotate transcript isoforms in the cucumber genome. Over 44.26 billion Q30 clean reads were mapped to the cucumber genome with an average mapping rate of 92.75%. Transcriptome assembly identified 151,453 transcripts spanning 20,442 loci. Among these, 12.7% of transcripts exactly matched annotated genes in the cucumber reference genome. More than 80% of the transcripts were classified as novel isoforms. Approximately 96.6% of these isoforms originated from known gene loci, while around 3.3% were derived from novel gene loci. Coding potential prediction identified 4,543 long non-coding RNAs (lncRNAs) across 3,376 loci. Building on these results, we identified tissue-specific transcripts in 10 tissues. Among these, 1,655 annotated genes and 4,214 predicted transcripts were considered tissue-specific. The root exhibited the highest number of tissue-specific transcripts, followed by the shoot apex. Subsequent selective pressure analysis revealed that tissue-specific regions experienced stronger directional selection than non-specific regions. Conclusions: By analyzing thousands of published transcriptome datasets, we identified abundant transcriptional events and tissue-specific transcripts in cucumber. This study adds value to the public data and offers insights for further exploration of a more comprehensive tissue regulatory network in cucumber. |
format | Article |
id | doaj-art-dec203996671415798a22381637ecbb1 |
institution | Kabale University |
issn | 1471-2164 |
language | English |
publishDate | 2025-01-01 |
publisher | BMC |
record_format | Article |
series | BMC Genomics |
spelling | doaj-art-dec203996671415798a22381637ecbb1; 2025-01-12T12:09:10Z; eng; BMC; BMC Genomics; 1471-2164; 2025-01-01; 261112; 10.1186/s12864-025-11212-w; Prediction of transcript isoforms and identification of tissue-specific genes in cucumber; Authors: Wenjiao Wang [0], Chengcheng Shen [1], Xinqiang Wen [2], Anqi Li [3], Qi Gao [4], Zhaoying Xu [5], Yuping Wei [6], Yushun Li [7], Dailu Guan [8], Bin Liu [9]; Affiliations (in listed order): College of Horticulture, Shanxi Agricultural University; College of Horticulture, Shanxi Agricultural University; College of Horticulture, Shanxi Agricultural University; College of Horticulture, Shanxi Agricultural University; College of Horticulture, Shanxi Agricultural University; College of Horticulture, Shanxi Agricultural University; College of Horticulture, Shanxi Agricultural University; Hami-melon Research Center, Xinjiang Academy of Agricultural Sciences; Department of Animal Science, University of California Davis; Hami-melon Research Center, Xinjiang Academy of Agricultural Sciences; https://doi.org/10.1186/s12864-025-11212-w; Cucumber; RNA-seq; Transcript isoform; Tissue-specific |
title | Prediction of transcript isoforms and identification of tissue-specific genes in cucumber |
title_sort | prediction of transcript isoforms and identification of tissue specific genes in cucumber |
topic | Cucumber; RNA-seq; Transcript isoform; Tissue-specific |
url | https://doi.org/10.1186/s12864-025-11212-w |