Leveraging Shapley Additive Explanations for Feature Selection in Ensemble Models for Diabetes Prediction
Diabetes, a significant global health crisis, is primarily driven in India by unhealthy diets and sedentary lifestyles, with rapid urbanization amplifying these effects through convenience-oriented living and limited physical activity opportunities, underscoring the need for advanced preventative st...
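Main Authors: | Prasant Kumar Mohanty, Sharmila Anand John Francis, Rabindra Kumar Barik, Diptendu Sinha Roy, Manob Jyoti Saikia |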
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-11-01 |
Series: | Bioengineering |
Subjects: | diabetes prediction; influential feature values; ensemble models; Shapley additive explanations |
Online Access: | https://www.mdpi.com/2306-5354/11/12/1215 |
_version_ | 1846105793474068480 |
---|---|
author | Prasant Kumar Mohanty; Sharmila Anand John Francis; Rabindra Kumar Barik; Diptendu Sinha Roy; Manob Jyoti Saikia |
author_sort | Prasant Kumar Mohanty |
collection | DOAJ |
description | Diabetes, a significant global health crisis, is primarily driven in India by unhealthy diets and sedentary lifestyles, with rapid urbanization amplifying these effects through convenience-oriented living and limited physical activity opportunities, underscoring the need for advanced preventative strategies and technology for effective management. This study integrates Shapley Additive explanations (SHAPs) into ensemble machine learning models to improve the accuracy and efficiency of diabetes predictions. By identifying the most influential features using SHAP, this study examined their role in maintaining high predictive performance while minimizing computational demands. The impact of feature selection on model accuracy was assessed across ten models using three feature sets: all features, the top three influential features, and all except these top three. Models focusing on the top three features achieved superior performance, with the ensemble model attaining a better performance in most of the metrics, outperforming comparable approaches. Notably, excluding these features led to a significant decline in performance, reinforcing their critical influence. These findings validate the effectiveness of targeted feature selection for efficient and robust clinical applications. |
format | Article |
id | doaj-art-cdbd7eda9fe34c868701ce6b8ee8c794 |
institution | Kabale University |
issn | 2306-5354 |
language | English |
publishDate | 2024-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Bioengineering |
spelling | doaj-art-cdbd7eda9fe34c868701ce6b8ee8c794; 2024-12-27T14:11:30Z; eng; MDPI AG; Bioengineering; 2306-5354; 2024-11-01; 11; 12; 1215; 10.3390/bioengineering11121215; Leveraging Shapley Additive Explanations for Feature Selection in Ensemble Models for Diabetes Prediction; Prasant Kumar Mohanty (Department of Computer Science and Engineering, National Institute of Technology, Meghalaya 793003, India); Sharmila Anand John Francis (Department of Computer Science, King Khalid University, Abha Campus, Rijal Alma, Abha 61421, Saudi Arabia); Rabindra Kumar Barik (School of Computer Applications, KIIT Deemed to be University, Bhubaneswar 751024, India); Diptendu Sinha Roy (Department of Computer Science and Engineering, National Institute of Technology, Meghalaya 793003, India); Manob Jyoti Saikia (Biomedical Sensors & Systems Lab, University of Memphis, Memphis, TN 38152, USA); Diabetes, a significant global health crisis, is primarily driven in India by unhealthy diets and sedentary lifestyles, with rapid urbanization amplifying these effects through convenience-oriented living and limited physical activity opportunities, underscoring the need for advanced preventative strategies and technology for effective management. This study integrates Shapley Additive explanations (SHAPs) into ensemble machine learning models to improve the accuracy and efficiency of diabetes predictions. By identifying the most influential features using SHAP, this study examined their role in maintaining high predictive performance while minimizing computational demands. The impact of feature selection on model accuracy was assessed across ten models using three feature sets: all features, the top three influential features, and all except these top three. Models focusing on the top three features achieved superior performance, with the ensemble model attaining a better performance in most of the metrics, outperforming comparable approaches. Notably, excluding these features led to a significant decline in performance, reinforcing their critical influence. These findings validate the effectiveness of targeted feature selection for efficient and robust clinical applications.; https://www.mdpi.com/2306-5354/11/12/1215; diabetes prediction; influential feature values; ensemble models; Shapley additive explanations |
title | Leveraging Shapley Additive Explanations for Feature Selection in Ensemble Models for Diabetes Prediction |
topic | diabetes prediction; influential feature values; ensemble models; Shapley additive explanations |
url | https://www.mdpi.com/2306-5354/11/12/1215 |
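
The description above outlines a three-way comparison: rank features by Shapley-based importance, then evaluate models on all features, on the top three, and on everything except the top three. The sketch below illustrates only the ranking and splitting steps, and it is not the authors' implementation. It assumes a toy linear scorer in place of the ensemble models, computes exact Shapley values by brute-force enumeration rather than calling the SHAP library, and uses made-up weights, samples, and a zero baseline. All function and variable names are hypothetical.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy model: a linear scorer over five anonymous features.
WEIGHTS = [3.0, 0.1, 2.0, 0.4, 1.0]
BASELINE = [0.0, 0.0, 0.0, 0.0, 0.0]  # assumed reference input
SAMPLES = [
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [2.0, 1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 1.0, 0.0, 1.0],
]


def toy_predict(x):
    """Weighted sum of the inputs (stand-in for a trained model)."""
    return sum(w * v for w, v in zip(WEIGHTS, x))


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                without_i = [x[j] if j in members else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def rank_features(predict, samples, baseline):
    """Order feature indices by mean absolute Shapley value, largest first."""
    n = len(baseline)
    totals = [0.0] * n
    for x in samples:
        for i, value in enumerate(shapley_values(predict, x, baseline)):
            totals[i] += abs(value)
    mean_abs = [t / len(samples) for t in totals]
    order = sorted(range(n), key=lambda i: mean_abs[i], reverse=True)
    return order, mean_abs


def split_feature_sets(order, k=3):
    """Build the three feature sets: all, top k, and all except the top k."""
    return {
        "all": sorted(order),
        "top_k": sorted(order[:k]),
        "rest": sorted(order[k:]),
    }


order, mean_abs = rank_features(toy_predict, SAMPLES, BASELINE)
feature_sets = split_feature_sets(order, k=3)
```

Each entry of `feature_sets` could then be used to select columns before training and scoring a model, which is where the comparison described in the record would take place. For realistic feature counts, an approximate method such as those provided by the SHAP library would replace the brute-force enumeration.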