Cov-trans: an efficient algorithm for discontinuous transcript assembly in coronaviruses

Bibliographic Details
Main Authors: Xiaoyu Guo, Zhenming Wu, Shu Zhang, Jin Zhao
Format: Article
Language: English
Published: BMC 2024-12-01
Series: BMC Genomics
Subjects:
Online Access: https://doi.org/10.1186/s12864-024-11179-0
collection DOAJ
description Abstract. Background: Discontinuous transcription allows coronaviruses to replicate and transmit efficiently within host cells, enhancing their adaptability and survival. Assembling viral transcripts is crucial for virology research and the development of antiviral strategies. However, traditional transcript assembly methods, which are designed primarily for alternative splicing events in eukaryotes, are not suitable for the viral transcript assembly problem. Current algorithms designed for assembling viral transcripts often struggle with low accuracy in determining transcript boundaries, so a highly accurate viral transcript assembly algorithm is urgently needed. Results: In this work, we propose Cov-trans, a reference-based transcript assembler specifically tailored to the discontinuous transcription of coronaviruses. Cov-trans first identifies canonical transcripts based on discontinuous transcription mechanisms, start and stop codons, and read alignment information. It then formulates the assembly of non-canonical transcripts as a path extraction problem and introduces a mixed integer linear programming model to recover these non-canonical transcripts. Conclusion: Experimental results show that Cov-trans outperforms other assemblers in both accuracy and recall, with a notable strength in accurately identifying transcript boundaries. Cov-trans is freely available at https://github.com/computer-Bioinfo/Cov-trans.git.
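The abstract frames non-canonical transcript assembly as a path extraction problem over aligned reads. The sketch below is only an illustration of that framing, not the Cov-trans method: it swaps the mixed integer linear program for a simple greedy "heaviest path peeling" heuristic on a small directed acyclic graph. The node names, coverage values, and the functions `enumerate_paths` and `extract_paths` are hypothetical and do not come from the paper or its repository.

```python
from typing import Dict, Iterator, List, Tuple

Edge = Tuple[str, str]


def enumerate_paths(graph: Dict[str, List[str]], source: str, sink: str) -> Iterator[List[str]]:
    """Yield every source-to-sink path of a DAG as a list of node names."""
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == sink:
            yield path
            continue
        for nxt in graph.get(node, []):
            stack.append(path + [nxt])


def extract_paths(coverage: Dict[Edge, float], source: str, sink: str,
                  min_support: float = 1.0) -> List[Tuple[List[str], float]]:
    """Greedy stand-in for an MILP: repeatedly take the path whose weakest
    edge has the most remaining coverage, then subtract that amount."""
    remaining = dict(coverage)
    graph: Dict[str, List[str]] = {}
    for u, v in coverage:
        graph.setdefault(u, []).append(v)

    results: List[Tuple[List[str], float]] = []
    while True:
        best_path, best_weight = None, 0.0
        for path in enumerate_paths(graph, source, sink):
            # Bottleneck weight: the least-covered edge limits the path.
            weight = min(remaining[(a, b)] for a, b in zip(path, path[1:]))
            if weight > best_weight:
                best_path, best_weight = path, weight
        if best_path is None or best_weight < min_support:
            break
        for a, b in zip(best_path, best_path[1:]):
            remaining[(a, b)] -= best_weight
        results.append((best_path, best_weight))
    return results


# Hypothetical junction coverage: one leader segment joined to two bodies.
demo_coverage = {
    ("src", "leader"): 100.0,
    ("leader", "body_A"): 60.0,
    ("leader", "body_B"): 40.0,
    ("body_A", "sink"): 60.0,
    ("body_B", "sink"): 40.0,
}
transcripts = extract_paths(demo_coverage, "src", "sink")
for path, support in transcripts:
    print(" -> ".join(path), support)
```

Exhaustive path enumeration is only practical for small graphs such as this toy case. An optimization-based formulation like the one the abstract describes would instead select paths and abundances jointly rather than one path at a time.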
format Article
id doaj-art-63d288044be54ceebcfc0be0e8424028
institution Kabale University
issn 1471-2164
language English
publishDate 2024-12-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj-art-63d288044be54ceebcfc0be0e8424028
2025-01-05T12:09:26Z
eng
BMC
BMC Genomics
1471-2164
2024-12-01
Volume 25, issue 1, pages 1-11
10.1186/s12864-024-11179-0
Xiaoyu Guo (0), Zhenming Wu (1), Shu Zhang (2), Jin Zhao (3)
(0-3) School of Computer Science and Technology, Qingdao University
title Cov-trans: an efficient algorithm for discontinuous transcript assembly in coronaviruses
topic Reference-based assembly
Mixed integer linear programming
Discontinuous transcription
Coronaviruses
url https://doi.org/10.1186/s12864-024-11179-0