Optimizing hypertension prediction using ensemble learning approaches.

Bibliographic Details
Main Authors: Isteaq Kabir Sifat, Md Kaderi Kibria
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0315865
collection DOAJ
description Hypertension (HTN) prediction is critical for effective preventive healthcare strategies. This study investigates how well ensemble learning techniques improve the accuracy of HTN prediction models. Using a dataset of 612 participants from Ethiopia with 27 features potentially associated with HTN risk, we aimed to enhance predictive performance over traditional single-model methods. A multi-faceted feature selection approach was employed, combining Boruta, Lasso regression, forward and backward selection, and random forest feature importance, which yielded 13 common features that were used for prediction. Five machine learning (ML) models, namely logistic regression (LR), artificial neural network (ANN), random forest (RF), extreme gradient boosting (XGB), and light gradient boosting machine (LGBM), together with a stacking ensemble model, were trained on the selected features to predict HTN. Model performance on the testing set was evaluated using accuracy, precision, recall, F1-score, and area under the curve (AUC). Additionally, SHapley Additive exPlanations (SHAP) was used to examine the impact of individual features on the models' predictions and to identify the most important risk factors for HTN. The stacking ensemble model emerged as the most effective approach for predicting HTN risk, achieving an accuracy of 96.32%, precision of 95.48%, recall of 97.51%, F1-score of 96.48%, and an AUC of 0.971. SHAP analysis of the stacking model identified weight, drinking habits, history of hypertension, salt intake, age, diabetes, BMI, and fat intake as the most significant and interpretable risk factors for HTN. Our results demonstrate significant advancements in predictive accuracy and robustness, highlighting the potential of ensemble learning as a pivotal tool in healthcare analytics. This research contributes to ongoing efforts to optimize HTN prediction models, ultimately supporting early intervention and personalized healthcare management.
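The evaluation metrics named in the description can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names (`classification_metrics`, `roc_auc`, `f1_from`) and the eight toy labels and scores are hypothetical and chosen only for illustration. The last lines check that the reported precision (95.48%) and recall (97.51%) are arithmetically consistent with the reported F1-score (96.48%).

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = hypertensive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted HTN probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)

# Consistency check on the figures reported in the description.
reported_f1 = round(f1_from(0.9548, 0.9751), 4)

print(metrics, auc, reported_f1)
```

Note that accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas AUC is computed from the ranking of scores alone, which is why the two kinds of metric are usually reported together.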
id doaj-art-ad0c4e6c80d444bb9020b69ef1d79cd8
institution Kabale University
issn 1932-6203
spelling doaj-art-ad0c4e6c80d444bb9020b69ef1d79cd8; 2025-01-08T05:32:39Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2024-01-01; vol. 19, no. 12, e0315865; doi:10.1371/journal.pone.0315865