A novel RFE-GRU model for diabetes classification using PIMA Indian dataset
Abstract: Diabetes is a long-term condition characterized by elevated blood sugar levels. It can lead to a variety of complex disorders such as stroke, renal failure, and heart attack. Machine learning can help diagnose diabetes at an early stage, as the disease cannot be tre...
Main Authors: | Mahmoud Y. Shams, Zahraa Tarek, Ahmed M. Elshewey |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Scientific Reports |
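Subjects: | Diabetes classification; Machine learning; Recursive feature elimination (RFE); Gated recurrent unit (GRU); KNN |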
Online Access: | https://doi.org/10.1038/s41598-024-82420-9 |
_version_ | 1841544798000906240 |
---|---|
author | Mahmoud Y. Shams; Zahraa Tarek; Ahmed M. Elshewey |
author_sort | Mahmoud Y. Shams |
collection | DOAJ |
description | Abstract: Diabetes is a long-term condition characterized by elevated blood sugar levels. It can lead to a variety of complex disorders such as stroke, renal failure, and heart attack. Machine learning can help diagnose diabetes at an early stage, which matters because the disease cannot be cured and places a significant burden on the health-care system. The PIMA Indian diabetes dataset (PIDD), used for classification in several studies, contains 768 instances and 9 attributes: eight predictors and one target. First, we performed a preprocessing stage that includes mean imputation and data normalization. Afterwards, we trained several machine learning (ML) models on the extracted features: Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbor (KNN), Naïve Bayes (NB), Histogram Gradient Boost (HGB), and Gated Recurrent Unit (GRU). To classify the PIDD, this paper proposes a new model called Recursive Feature Elimination-GRU (RFE-GRU). RFE selects the features in the training dataset that are most important for predicting the target variable, while the GRU handles the vanishing and exploding gradient problem when learning from the features that RFE selects. The RFE-GRU model achieved a precision of 90.50%, recall of 90.70%, F1-score of 90.50%, accuracy of 90.70%, and Area Under the Curve (AUC) of 0.9278. The comparative results showed that the RFE-GRU model outperforms the other classification models. |
format | Article |
id | doaj-art-13769b4fb7904e50ab99578019795d95 |
institution | Kabale University |
issn | 2045-2322 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj-art-13769b4fb7904e50ab99578019795d95; 2025-01-12T12:16:13Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-01-01; 15; 1; 1; 22; 10.1038/s41598-024-82420-9; A novel RFE-GRU model for diabetes classification using PIMA Indian dataset; Mahmoud Y. Shams (0); Zahraa Tarek (1); Ahmed M. Elshewey (2); Faculty of Artificial Intelligence, Kafrelsheikh University; Faculty of Computers and Information, Computer Science Department, Mansoura University; Department of Computer Science, Faculty of Computers and Information, Suez University; https://doi.org/10.1038/s41598-024-82420-9; Diabetes classification; Machine learning; Recursive feature elimination (RFE); Gated recurrent unit (GRU); KNN |
title | A novel RFE-GRU model for diabetes classification using PIMA Indian dataset |
topic | Diabetes classification; Machine learning; Recursive feature elimination (RFE); Gated recurrent unit (GRU); KNN |
url | https://doi.org/10.1038/s41598-024-82420-9 |
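
The abstract in this record describes a pipeline of mean imputation, data normalization, and recursive feature elimination (RFE) ahead of the GRU classifier. The following is a minimal sketch of those first three steps only, under stated assumptions: the record does not say which normalization or which estimator drives RFE, so min-max scaling and a small logistic regression are stand-ins, and the function names (`impute_mean`, `min_max_scale`, `fit_logistic`, `rfe`) and the toy data are hypothetical, not taken from the paper or from PIDD.

```python
import math

def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def min_max_scale(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0 (assumed normalization)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [[(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(r, lows, spans)]
            for r in rows]

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent and return the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w

def rfe(X, y, n_keep):
    """Recursive feature elimination: refit, drop the smallest-|weight| feature, repeat."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        w = fit_logistic([[row[j] for j in active] for row in X], y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return active  # indices of the surviving columns

# Toy data, made up for illustration (not PIDD): column 0 tracks the label,
# column 1 is largely unrelated to it and has one missing value.
raw = [[0.0, 2.0], [1.0, 8.0], [2.0, 2.0], [1.0, None],
       [9.0, 2.0], [10.0, 8.0], [8.0, 2.0], [9.0, 8.0]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

X = min_max_scale(impute_mean(raw))
selected = rfe(X, labels, n_keep=1)
print(selected)
```

In a pipeline like the one the abstract outlines, the columns returned by `rfe` would then be passed to the GRU classifier. Scaling before elimination keeps the weight magnitudes comparable across features, which is what makes ranking by absolute weight meaningful in this sketch.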