Multi-S3P: Protein Secondary Structure Prediction With Specialized Multi-Network and Self-Attention-Based Deep Learning Model

Protein structure prediction (PSP) is a vital challenge in bioinformatics, structural biology and drug discovery. Protein secondary structure (SS) prediction is critical since three-dimensional (3D) structures are primarily made up of secondary structures. With the advancement of deep learning appro...

Bibliographic Details
Main Authors:	M. M. Mohamed Mufassirin, M. A. Hakim Newton, Julia Rahman, Abdul Sattar
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	Deep learning; convolutional neural network; protein structure prediction; protein secondary structure; recurrent neural network
Online Access:	https://ieeexplore.ieee.org/document/10143539/

_version_	1846128600337612800
author	M. M. Mohamed Mufassirin; M. A. Hakim Newton; Julia Rahman; Abdul Sattar
author_sort	M. M. Mohamed Mufassirin
collection	DOAJ
description	Protein structure prediction (PSP) is a vital challenge in bioinformatics, structural biology and drug discovery. Protein secondary structure (SS) prediction is critical because three-dimensional (3D) structures are primarily made up of secondary structures. With the advancement of deep learning approaches, SS classification accuracy has improved significantly. Many existing methods use an ensemble of complex neural networks to improve SS prediction. Because of the high dimensionality of the hyperparameter space, deep neural networks with complex architectures are typically challenging to train effectively. Predicting secondary structures in the boundary regions between different types of SS is also challenging. This study presents Multi-S3P, which employs bidirectional Long Short-Term Memory (BILSTM) and Convolutional Neural Networks (CNN) with a self-attention mechanism to improve secondary structure prediction. Its training strategy captures the unique characteristics of each type of secondary structure and combines them more effectively. The ensemble of CNN and BILSTM can learn both contextual information and long-range interactions between the residues. In addition, the self-attention mechanism allows the model to focus on the most important features for improving performance. We used the SPOT-1D dataset for the training and validation of our model, with a set of four input features derived from amino acid sequences. Further, the model was tested on four popular independent test datasets and compared with various state-of-the-art predictors. The presented results show that Multi-S3P outperformed the other methods in terms of Q3 accuracy, Q8 accuracy and other performance metrics, achieving the highest Q3 accuracy of 87.57% and a Q8 accuracy of 77.56% on the TEST2016 test set. More importantly, Multi-S3P demonstrates high performance in SS boundary regions. Our experiment also demonstrates that the combination of different input features and a multi-network-based training strategy significantly improved the performance.
format	Article
id	doaj-art-4b1fb70b679e42c49606635c4e9d880f
institution	Kabale University
issn	2169-3536
language	English
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-4b1fb70b679e42c49606635c4e9d880f | 2024-12-11T00:01:00Z | eng | IEEE | IEEE Access | ISSN 2169-3536 | 2023-01-01 | vol. 11, pp. 57083-57096 | DOI 10.1109/ACCESS.2023.3282702 | 10143539 | Multi-S3P: Protein Secondary Structure Prediction With Specialized Multi-Network and Self-Attention-Based Deep Learning Model | M. M. Mohamed Mufassirin (https://orcid.org/0000-0002-3141-7023), School of Information and Communication Technology, Griffith University, Nathan, QLD, Australia | M. A. Hakim Newton, Institute for Integrated and Intelligent Systems, Griffith University, Nathan, QLD, Australia | Julia Rahman (https://orcid.org/0000-0001-5005-9922), School of Information and Communication Technology, Griffith University, Nathan, QLD, Australia | Abdul Sattar (https://orcid.org/0000-0002-2567-2052), School of Information and Communication Technology, Griffith University, Nathan, QLD, Australia
title	Multi-S3P: Protein Secondary Structure Prediction With Specialized Multi-Network and Self-Attention-Based Deep Learning Model
topic	Deep learning; convolutional neural network; protein structure prediction; protein secondary structure; recurrent neural network
url	https://ieeexplore.ieee.org/document/10143539/
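
The description reports Q3 and Q8 accuracy, which are per-residue accuracies over three and eight secondary structure states respectively. The sketch below shows one way such scores could be computed from label strings. The function names and the 8-to-3 state reduction (H, G, I to helix; E, B to strand; everything else to coil) are assumptions chosen for illustration; this record does not state which mapping or evaluation code the authors used, and predictors often emit the three-state labels from a separate output rather than by reducing eight-state predictions.

```python
# Assumed 8-to-3 state reduction; a common convention, not confirmed by this record.
# Coil is written as "C" here; some tools use "L" or "-" instead.
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",   # helix-like states
    "E": "E", "B": "E",             # strand-like states
    "T": "C", "S": "C", "C": "C",   # turn, bend and coil
}


def accuracy(true_ss: str, pred_ss: str) -> float:
    """Percentage of residues whose predicted label matches the true label."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("label strings must have the same length")
    if not true_ss:
        raise ValueError("label strings must not be empty")
    correct = sum(t == p for t, p in zip(true_ss, pred_ss))
    return 100.0 * correct / len(true_ss)


def to_three_state(ss8: str) -> str:
    """Reduce an eight-state label string to three states (assumed mapping)."""
    return "".join(EIGHT_TO_THREE[s] for s in ss8)


def q8_q3(true_ss8: str, pred_ss8: str) -> tuple[float, float]:
    """Return (Q8, Q3) for eight-state true and predicted label strings."""
    q8 = accuracy(true_ss8, pred_ss8)
    q3 = accuracy(to_three_state(true_ss8), to_three_state(pred_ss8))
    return q8, q3


if __name__ == "__main__":
    # Hypothetical 10-residue example, not data from the paper.
    print(q8_q3("HHHGEEBTSC", "HHHHEEETCC"))
```

In this toy example the three mismatches (G vs H, B vs E, S vs C) all fall within the same reduced class, so the three-state score comes out higher than the eight-state score, which is the usual direction of the gap between Q3 and Q8.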
