Human essential gene identification based on feature fusion and feature screening
Abstract Essential genes are necessary to sustain the life of a species under adequate nutritional conditions. These genes have attracted significant attention for their potential as drug targets, especially in developing broad‐spectrum antibacterial drugs. However, studying essential genes remains...
| Main Authors: | Zhao‐Yue Zhang, Yue‐Er Fan, Cheng‐Bing Huang, Meng‐Ze Du |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley 2024-12-01 |
| Series: | IET Systems Biology |
| Subjects: | bioinformatics, essential gene, feature selection, neural nets |
| Online Access: | https://doi.org/10.1049/syb2.12105 |
| _version_ | 1846110632428961792 |
|---|---|
| author | Zhao‐Yue Zhang Yue‐Er Fan Cheng‐Bing Huang Meng‐Ze Du |
| author_facet | Zhao‐Yue Zhang Yue‐Er Fan Cheng‐Bing Huang Meng‐Ze Du |
| author_sort | Zhao‐Yue Zhang |
| collection | DOAJ |
| description | Abstract Essential genes are necessary to sustain the life of a species under adequate nutritional conditions. These genes have attracted significant attention for their potential as drug targets, especially in developing broad‐spectrum antibacterial drugs. However, studying essential genes remains challenging due to their variability in specific environmental conditions. In this study, the authors aim to develop a powerful prediction model for identifying essential genes in humans. The authors first obtained the essential gene data from human cancer cell lines and characterised gene sequences using 7 feature encoding methods such as Kmer, the Composition of K‐spaced Nucleic Acid Pairs, and Z‐curve. Subsequently, feature fusion and feature optimisation strategies were employed to select the impactful features. Finally, machine learning algorithms were applied to construct the prediction models and evaluate their performance. The single‐feature‐based model achieved the highest area under the Receiver Operating Characteristic curve (AUC) of 0.830. After fusing and filtering these features, the classical machine learning models achieved the highest AUC at 0.823 while the deep learning model reached 0.860. Results obtained by the authors show that compared to using individual features, feature fusion and feature optimisation strategies significantly improved model performance. Moreover, the study provided an advantageous method for essential gene identification compared to other methods. |
| format | Article |
| id | doaj-art-b0b78c89d4a84d1e9b344f243a931161 |
| institution | Kabale University |
| issn | 1751-8849 1751-8857 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Wiley |
| record_format | Article |
| series | IET Systems Biology |
| spelling | doaj-art-b0b78c89d4a84d1e9b344f243a931161 2024-12-23T18:41:56Z eng Wiley IET Systems Biology 1751-8849 1751-8857 2024-12-01 18 6 227 237 10.1049/syb2.12105 Human essential gene identification based on feature fusion and feature screening Zhao‐Yue Zhang 0 Yue‐Er Fan 1 Cheng‐Bing Huang 2 Meng‐Ze Du 3 School of Healthcare Technology Chengdu Neusoft University Chengdu China School of Life Science and Technology University of Electronic Science and Technology of China Chengdu China School of Computer Science and Technology ABa Teachers University Chengdu China School of Healthcare Technology Chengdu Neusoft University Chengdu China Abstract Essential genes are necessary to sustain the life of a species under adequate nutritional conditions. These genes have attracted significant attention for their potential as drug targets, especially in developing broad‐spectrum antibacterial drugs. However, studying essential genes remains challenging due to their variability in specific environmental conditions. In this study, the authors aim to develop a powerful prediction model for identifying essential genes in humans. The authors first obtained the essential gene data from human cancer cell lines and characterised gene sequences using 7 feature encoding methods such as Kmer, the Composition of K‐spaced Nucleic Acid Pairs, and Z‐curve. Subsequently, feature fusion and feature optimisation strategies were employed to select the impactful features. Finally, machine learning algorithms were applied to construct the prediction models and evaluate their performance. The single‐feature‐based model achieved the highest area under the Receiver Operating Characteristic curve (AUC) of 0.830. After fusing and filtering these features, the classical machine learning models achieved the highest AUC at 0.823 while the deep learning model reached 0.860. Results obtained by the authors show that compared to using individual features, feature fusion and feature optimisation strategies significantly improved model performance. Moreover, the study provided an advantageous method for essential gene identification compared to other methods. https://doi.org/10.1049/syb2.12105 bioinformatics essential gene feature selection neural nets |
| spellingShingle | Zhao‐Yue Zhang Yue‐Er Fan Cheng‐Bing Huang Meng‐Ze Du Human essential gene identification based on feature fusion and feature screening IET Systems Biology bioinformatics essential gene feature selection neural nets |
| title | Human essential gene identification based on feature fusion and feature screening |
| title_full | Human essential gene identification based on feature fusion and feature screening |
| title_fullStr | Human essential gene identification based on feature fusion and feature screening |
| title_full_unstemmed | Human essential gene identification based on feature fusion and feature screening |
| title_short | Human essential gene identification based on feature fusion and feature screening |
| title_sort | human essential gene identification based on feature fusion and feature screening |
| topic | bioinformatics essential gene feature selection neural nets |
| url | https://doi.org/10.1049/syb2.12105 |
| work_keys_str_mv | AT zhaoyuezhang humanessentialgeneidentificationbasedonfeaturefusionandfeaturescreening AT yueerfan humanessentialgeneidentificationbasedonfeaturefusionandfeaturescreening AT chengbinghuang humanessentialgeneidentificationbasedonfeaturefusionandfeaturescreening AT mengzedu humanessentialgeneidentificationbasedonfeaturefusionandfeaturescreening |
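
The abstract names three of the sequence encodings (Kmer, the Composition of K‐spaced Nucleic Acid Pairs, and Z‐curve) and a feature fusion step, but this record does not give their exact definitions or parameters. The following is a minimal Python sketch of how such encodings are commonly computed. It rests on several assumptions: fusion is taken to be plain concatenation, the Z‐curve is the simple three-component global form, and the function names, `k=2`, and `gap=1` are hypothetical choices for illustration, not the authors' settings. Feature screening and model training are not shown.

```python
from itertools import product

ALPHABET = "ACGT"


def _word_frequencies(observed, size):
    """Normalised counts of the observed words over all ACGT words of `size`."""
    words = ["".join(p) for p in product(ALPHABET, repeat=size)]
    counts = dict.fromkeys(words, 0)
    total = 0
    for word in observed:
        if word in counts:  # skip words containing N or other symbols
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in words]


def kmer_features(seq, k=2):
    """Frequency of every contiguous length-k word (4**k values)."""
    seq = seq.upper()
    return _word_frequencies((seq[i:i + k] for i in range(len(seq) - k + 1)), k)


def cksnap_features(seq, gap=1):
    """Frequency of nucleotide pairs separated by `gap` positions (16 values)."""
    seq = seq.upper()
    pairs = (seq[i] + seq[i + gap + 1] for i in range(len(seq) - gap - 1))
    return _word_frequencies(pairs, 2)


def zcurve_features(seq):
    """Global three-component Z-curve computed from base frequencies."""
    seq = seq.upper()
    n = sum(seq.count(base) for base in ALPHABET)
    if n == 0:
        return [0.0, 0.0, 0.0]
    a, c, g, t = (seq.count(base) / n for base in ALPHABET)
    return [
        (a + g) - (c + t),  # purine vs pyrimidine
        (a + c) - (g + t),  # amino vs keto
        (a + t) - (g + c),  # weak vs strong hydrogen bonding
    ]


def fuse_features(seq):
    """Assumed fusion strategy: concatenate the individual encodings."""
    return kmer_features(seq) + cksnap_features(seq) + zcurve_features(seq)


if __name__ == "__main__":
    vector = fuse_features("ACGTAC")
    print(len(vector))
```

With these illustrative settings each sequence maps to a fixed-length vector of 35 values (16 + 16 + 3), which could then be passed to a feature selection step and a classifier of the reader's choice.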