DeepReg: a deep learning hybrid model for predicting transcription factors in eukaryotic and prokaryotic genomes

Bibliographic Details
Main Authors:	Leonardo Ledesma-Dominguez, Erik Carbajal-Degante, Gabriel Moreno-Hagelsieb, Ernesto Pérez-Rueda
Format:	Article
Language:	English
Published:	Nature Portfolio 2024-04-01
Series:	Scientific Reports
Online Access:	https://doi.org/10.1038/s41598-024-59487-5

collection	DOAJ
description	Abstract: Deep learning models (DLMs) have gained importance in predicting, detecting, translating, and classifying diverse inputs. In bioinformatics, DLMs have been used to predict protein structures, transcription factor-binding sites, and promoters. In this work, we propose a hybrid model, named Deep Regulation (DeepReg), to identify transcription factors (TFs) among prokaryotic and eukaryotic protein sequences. The model combines two architectures: a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. DeepReg reached a precision of 0.99, a recall of 0.97, and an F1-score of 0.98. The quality of our predictions, the bias-variance trade-off, and the characterization of new TF predictions were evaluated and compared against those produced by DeepTFactor, as well as against experimental data from three model organisms. Predictions from DeepReg tended to exhibit less variance and bias than those from DeepTFactor, thus increasing reliability and decreasing overfitting.
id	doaj-art-a57e838fde374b9aa17f0e0e5b4b5edf
institution	Kabale University
issn	2045-2322
spelling	doaj-art-a57e838fde374b9aa17f0e0e5b4b5edf | 2024-11-17T12:21:24Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2024-04-01 | 14 | 1 | 1 | 11 | 10.1038/s41598-024-59487-5 | DeepReg: a deep learning hybrid model for predicting transcription factors in eukaryotic and prokaryotic genomes | Leonardo Ledesma-Dominguez (Posgrado en Ciencia en Ingeniería de la Computación, Universidad Nacional Autónoma de México) | Erik Carbajal-Degante (Coordinación de Universidad Abierta, Innovación Educativa y Educación a Distancia (CUAIEED), Universidad Nacional Autónoma de México) | Gabriel Moreno-Hagelsieb (Department of Biology, Wilfrid Laurier University) | Ernesto Pérez-Rueda (Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Unidad Académica del Estado de Yucatán, Universidad Nacional Autónoma de México) | https://doi.org/10.1038/s41598-024-59487-5

Similar Items