Anticancer Peptides Classification Using Long-Short-Term Memory With Novel Feature Representation

Cancer treatment is a challenging endeavor because of the intricacy, heterogeneity, and diversity of cancer causes. Comprehensive therapeutic approaches are crucial for cancer treatment. Anticancer peptides (ACPs) present a potentially effective therapeutic option. However, the extensive identificat...

Bibliographic Details
Main Authors: Nazer Al Tahifah, Muhammad Sohail Ibrahim, Erum Rehman, Naveed Ahmed, Abdul Wahab, Shujaat Khan
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10816412/
collection DOAJ
description Cancer treatment is a challenging endeavor because of the intricacy, heterogeneity, and diversity of cancer causes. Comprehensive therapeutic approaches are crucial for cancer treatment. Anticancer peptides (ACPs) present a potentially effective therapeutic option. However, the large-scale identification and synthesis of these peptides remain a persistent difficulty that calls for effective prediction techniques. Existing techniques either suffer from low accuracy or employ high-dimensional feature sets, frequently producing sparse features and leading to ineffective model designs. This work presents a novel set of features and a long short-term memory (LSTM)-based classification strategy to create an efficient model. The suggested feature set includes three new and two modern feature extraction methods. The binary profile feature and the k-mer sparse matrix of the reduced amino acid alphabet make up the modern feature set. The novel features are derived from the composition of the K-spaced side chain pairs (CKSSCP), the composition of the K-spaced electrically charged side chain pairs (CKSECSCP), and the combination of [pk(CO2H)] + [pk(NH2)] + [pk(R)] + [isoelectric point]. The suggested LSTM model is trained using the combined feature set. The trials are carried out with k-fold cross-validation on benchmark datasets. The results indicate that the proposed model outperforms alternative ACP classification techniques in terms of Matthews correlation coefficient (MCC) and accuracy. On the ACP740 dataset with 5-fold cross-validation, the model yields an MCC score of 75%, which is 12%, 11%, 3%, and 8% greater than those of the ACP-DL, ACP-DA, ACP-MHCNN, and ACP-KSRC approaches, respectively. On the ACP344 dataset with 10-fold cross-validation, the proposed method achieves an MCC score of 85.14%, which is 23% and 2% higher than the MCC scores of the ACP-DL and SAP methods, respectively. The better classification performance offered by the proposed approach could help identify new ACPs and better understand their structural and chemical characteristics. The source code and the datasets are available on the author's GitHub page (https://github.com/Shujaat123/ACP-LSTM-NFR).
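The description names the composition of K-spaced electrically charged side chain pairs (CKSECSCP) without defining it in this record. The following is a minimal sketch of how such a feature could be computed, assuming it follows the usual composition-of-K-spaced-pairs recipe: map each residue to a class, then count class pairs separated by k intervening residues and normalize by the number of pairs. The three charge classes (K, R, H as positive; D, E as negative; all others neutral), the function name, and the feature key format are illustrative assumptions, not the authors' implementation, which is available in their repository.

```python
from itertools import product

# Hypothetical grouping by side-chain charge at physiological pH.
# The paper's exact class definitions may differ from this assumption.
CHARGE_CLASS = {aa: "0" for aa in "ACFGILMNPQSTVWY"}
CHARGE_CLASS.update({aa: "+" for aa in "KRH"})
CHARGE_CLASS.update({aa: "-" for aa in "DE"})


def k_spaced_pair_composition(seq, k_max=2):
    """Return normalized counts of charge-class pairs for gaps k = 0..k_max.

    For each gap k, residues at positions i and i + k + 1 form a pair.
    Counts are divided by the number of such pairs in the sequence.
    Residues outside the 20 standard amino acids raise KeyError.
    """
    classes = [CHARGE_CLASS[aa] for aa in seq.upper()]
    labels = "+-0"
    features = {}
    for k in range(k_max + 1):
        counts = {a + b: 0 for a, b in product(labels, repeat=2)}
        n_pairs = len(classes) - k - 1
        for i in range(max(n_pairs, 0)):
            counts[classes[i] + classes[i + k + 1]] += 1
        for pair, count in counts.items():
            # Sequences too short for this gap contribute all-zero features.
            features[f"{pair}|k={k}"] = count / n_pairs if n_pairs > 0 else 0.0
    return features


if __name__ == "__main__":
    feats = k_spaced_pair_composition("KDA", k_max=1)
    for name, value in feats.items():
        if value > 0:
            print(name, value)
```

With three classes and gaps 0 through k_max, the vector has 9 × (k_max + 1) entries, which stays far smaller than the 400 pairs per gap produced by the full 20-letter alphabet. That is in line with the description's stated aim of avoiding high-dimensional, sparse feature sets.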
format Article
id doaj-art-91e098eae0a84a91b9f7e09ebb65e6df
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-91e098eae0a84a91b9f7e09ebb65e6df
2025-01-03T00:01:47Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
vol. 13, pp. 67-79
10.1109/ACCESS.2024.3523068
10816412
Anticancer Peptides Classification Using Long-Short-Term Memory With Novel Feature Representation
Nazer Al Tahifah (https://orcid.org/0009-0009-1448-7685), Department of Computer Engineering, College of Computing and Mathematics, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
Muhammad Sohail Ibrahim (https://orcid.org/0000-0002-1387-0879), Department of Mechanical Systems Engineering, Kumoh National Institute of Technology, Gumi-si, South Korea
Erum Rehman (https://orcid.org/0000-0003-0939-1880), Department of Mathematics, Nazarbayev University, Astana, Kazakhstan
Naveed Ahmed (https://orcid.org/0000-0002-9322-0373), Department of Mathematics and Natural Sciences, Center for Applied Mathematics and Bio-Informatics (CAMB), Gulf University for Science and Technology (GUST), Mubarak Al-Abdullah, Kuwait
Abdul Wahab (https://orcid.org/0000-0002-9179-7427), Department of Mathematics, College of Science, Sultan Qaboos University, Muscat, Oman
Shujaat Khan (https://orcid.org/0000-0001-9676-6817), Department of Computer Engineering, College of Computing and Mathematics, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
https://ieeexplore.ieee.org/document/10816412/
title Anticancer Peptides Classification Using Long-Short-Term Memory With Novel Feature Representation
topic Anticancer peptides (ACPs)
composition of K-spaced amino-acid pairs (CKSAAP)
long-short-term-memory (LSTM)
composition of the K-spaced side chain pairs (CKSSCP)
composition of the K-spaced electrically charged side chain pairs (CKSECSCP)
isoelectric point (pI)
url https://ieeexplore.ieee.org/document/10816412/