deepTFBS: Improving within‐ and Cross‐Species Prediction of Transcription Factor Binding Using Deep Multi‐Task and Transfer Learning
Abstract The precise prediction of transcription factor binding sites (TFBSs) is crucial in understanding gene regulation. In this study, deepTFBS, a comprehensive deep learning (DL) framework that builds a robust DNA language model of TF binding grammar for accurately predicting TFBSs within and ac...
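| Main Authors: | Jingjing Zhai, Yuzhou Zhang, Chujun Zhang, Xiaotong Yin, Minggui Song, Chenglong Tang, Pengjun Ding, Zenglin Li, Chuang Ma |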
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley, 2025-08-01 |
| Series: | Advanced Science |
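| Subjects: | bioinformatics; cross‐species prediction; deep learning; machine learning; transcriptional regulatory network |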
| Online Access: | https://doi.org/10.1002/advs.202503135 |
| _version_ | 1849233238966403072 |
|---|---|
| author | Jingjing Zhai; Yuzhou Zhang; Chujun Zhang; Xiaotong Yin; Minggui Song; Chenglong Tang; Pengjun Ding; Zenglin Li; Chuang Ma |
| collection | DOAJ |
| description | Abstract The precise prediction of transcription factor binding sites (TFBSs) is crucial in understanding gene regulation. In this study, deepTFBS, a comprehensive deep learning (DL) framework that builds a robust DNA language model of TF binding grammar for accurately predicting TFBSs within and across plant species, is presented. Taking advantage of multi‐task DL and transfer learning, deepTFBS is capable of leveraging the knowledge learned from large‐scale TF binding profiles to enhance the prediction of TFBSs under small‐sample training and cross‐species prediction tasks. When tested using available information on 359 Arabidopsis TFs, deepTFBS outperformed previously described prediction strategies, including position weight matrix, deepSEA, and DanQ, with a 244.49%, 49.15%, and 23.32% improvement of the area under the precision‐recall curve (PRAUC), respectively. Further cross‐species prediction of TFBSs in wheat showed that deepTFBS yielded a significant PRAUC improvement of 30.6% over these three baseline models. deepTFBS can also utilize information from gene conservation and binding motifs, enabling efficient TFBS prediction in species where experimental data availability is limited. A case study, focusing on the WUSCHEL (WUS) transcription factor, illustrated the potential use of deepTFBS in cross‐species applications, in our example between Arabidopsis and wheat. deepTFBS is publicly available at https://github.com/cma2015/deepTFBS. |
| format | Article |
| id | doaj-art-2a2d8173f45746749a110e956b4a9388 |
| institution | Kabale University |
| issn | 2198-3844 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Wiley |
| record_format | Article |
| series | Advanced Science |
| spelling | doaj-art-2a2d8173f45746749a110e956b4a9388; 2025-08-20T11:56:10Z; eng; Wiley; Advanced Science; 2198-3844; 2025-08-01; 12; 30; n/a; n/a; 10.1002/advs.202503135; deepTFBS: Improving within‐ and Cross‐Species Prediction of Transcription Factor Binding Using Deep Multi‐Task and Transfer Learning; Jingjing Zhai (0), Yuzhou Zhang (1), Chujun Zhang (2), Xiaotong Yin (3), Minggui Song (4), Chenglong Tang (5), Pengjun Ding (6), Zenglin Li (7), Chuang Ma (8); affiliation of authors 0, 1, 2, 4, 6, 7, 8: State Key Laboratory for Crop Stress Resistance and High‐Efficiency Production, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi 712100, China; affiliation of authors 3, 5: College of Life Sciences, Northwest A&F University, Yangling, Shaanxi 712100, China; https://doi.org/10.1002/advs.202503135; bioinformatics; cross‐species prediction; deep learning; machine learning; transcriptional regulatory network |
| title | deepTFBS: Improving within‐ and Cross‐Species Prediction of Transcription Factor Binding Using Deep Multi‐Task and Transfer Learning |
| topic | bioinformatics; cross‐species prediction; deep learning; machine learning; transcriptional regulatory network |
| url | https://doi.org/10.1002/advs.202503135 |