Use of Machine Learning to Predict the Incidence of Type 2 Diabetes Among Relatively Healthy Adults: A 10-Year Longitudinal Study in Taiwan
Background: The prevalence of diabetes is increasing worldwide, particularly in the Pacific Ocean island nations. Although machine learning (ML) models and data mining approaches have been applied to diabetes research, there was no study utilizing ML models to predict diabetes inc...
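Main Authors: | Ying-Qiang Liu, Tzu-Wei Chang, Lung-Chun Lee, Chia-Yu Chen, Pi-Shan Hsu, Yu-Tse Tsan, Chao-Tung Yang, Wei-Min Chu |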
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-12-01 |
Series: | Diagnostics |
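Subjects: | machine learning models; diabetes; free thyroxine; glycated hemoglobin; fasting blood glucose; weight |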
Online Access: | https://www.mdpi.com/2075-4418/15/1/72 |
_version_ | 1841549259650891776 |
---|---|
author | Ying-Qiang Liu Tzu-Wei Chang Lung-Chun Lee Chia-Yu Chen Pi-Shan Hsu Yu-Tse Tsan Chao-Tung Yang Wei-Min Chu |
author_sort | Ying-Qiang Liu |
collection | DOAJ |
description | Background: The prevalence of diabetes is increasing worldwide, particularly in Pacific island nations. Although machine learning (ML) models and data mining approaches have been applied to diabetes research, no study had used ML models to predict diabetes incidence in Taiwan. We aimed to predict the onset of diabetes in order to raise health awareness, thereby promoting any necessary lifestyle modifications and helping to mitigate disease burden. Methods: The research dataset was retrieved from the Clinical Data Center of Taichung Veterans General Hospital. We collected data from the available electronic health records, and a total of 33 items were used for model construction. Individuals with diabetes and those with missing data were excluded. Ultimately, 6687 adults were included in the final analysis, in which we implemented three ML algorithms to predict diabetes: logistic regression (LR), random forest (RF), and extreme gradient boosting (XGBoost). Results: The five most important factors in the prediction model were glycated hemoglobin (HbA1c), fasting blood glucose, weight, free thyroxine (fT4), and triglycerides (TG). Random forest, logistic regression, and XGBoost reached 99%, 99%, and 98% accuracy, respectively. fT4 appears to be one of the significant features in predicting the onset of diabetes, and this appears to be the first study using ML models to predict diabetes that has demonstrated the importance of thyroid hormone. Conclusions: A total of 33 items could be entered into the ML models to predict diabetes with promising accuracy. Compared with prior ML studies, this study not only identified similar key factors for predicting diabetes but also highlighted the significance of thyroid hormones, a factor that was previously overlooked. It also highlighted the relevance of predicting type 2 diabetes using more affordable methods, which would be useful for clinical healthcare professionals and endocrinologists who apply the models to clinical practice. |
format | Article |
id | doaj-art-397980418f5a4282897494981ad75d68 |
institution | Kabale University |
issn | 2075-4418 |
language | English |
publishDate | 2024-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Diagnostics |
spelling | doaj-art-397980418f5a4282897494981ad75d68; 2025-01-10T13:16:38Z; eng; MDPI AG; Diagnostics; ISSN 2075-4418; 2024-12-01; vol. 15, no. 1, art. 72; DOI 10.3390/diagnostics15010072; Use of Machine Learning to Predict the Incidence of Type 2 Diabetes Among Relatively Healthy Adults: A 10-Year Longitudinal Study in Taiwan; Ying-Qiang Liu (Department of Medical Education, Taichung Veterans General Hospital, Taichung 407219, Taiwan); Tzu-Wei Chang (Department of Family Medicine, Taichung Veterans General Hospital, Taichung 407219, Taiwan); Lung-Chun Lee (Department of Family Medicine, Taichung Veterans General Hospital, Taichung 407219, Taiwan); Chia-Yu Chen (Department of Application Value-Added Service, SYSTEX Corporation, Taipei 114730, Taiwan); Pi-Shan Hsu (Department of Family Medicine, Taichung Veterans General Hospital, Taichung 407219, Taiwan); Yu-Tse Tsan (Division of Occupational Medicine, Department of Emergency Medicine, Taichung Veterans General Hospital, Taichung 407219, Taiwan); Chao-Tung Yang (Department of Computer Science, Tunghai University, Taichung 407224, Taiwan); Wei-Min Chu (Department of Family Medicine, Taichung Veterans General Hospital, Taichung 407219, Taiwan); https://www.mdpi.com/2075-4418/15/1/72; keywords: machine learning models, diabetes, free thyroxine, glycated hemoglobin, fasting blood glucose, weight |
title | Use of Machine Learning to Predict the Incidence of Type 2 Diabetes Among Relatively Healthy Adults: A 10-Year Longitudinal Study in Taiwan |
topic | machine learning models diabetes free thyroxine glycated hemoglobin fasting blood glucose weight |
url | https://www.mdpi.com/2075-4418/15/1/72 |
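
The abstract in this record describes fitting classifiers to health-record items, reporting accuracy, and ranking the most important factors. The record itself gives no code, data, or ranking method, so the following is only a minimal sketch under stated assumptions: it uses synthetic data rather than the hospital dataset, it covers logistic regression only (one of the three algorithm families named, with no random forest or XGBoost), and it ranks features by permutation importance, which is an assumed stand-in for whatever the authors used. The feature names and the helper functions (`make_data`, `fit_logistic`, `permutation_importance`) are hypothetical.

```python
import math
import random

# Hypothetical standardized features; the third is pure noise by construction.
FEATURES = ["hba1c", "fasting_glucose", "noise"]


def make_data(n, rng):
    """Synthetic data only: the label depends on the first two features."""
    rows, labels = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in FEATURES]
        score = 2.0 * x[0] + 1.0 * x[1] + rng.gauss(0, 0.2)
        rows.append(x)
        labels.append(1 if score > 0 else 0)
    return rows, labels


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, epochs=200, lr=0.5):
    """Logistic regression fitted by full-batch gradient descent."""
    w, b, n = [0.0] * len(rows[0]), 0.0, len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the label."""
    w, b = model
    hits = 0
    for x, y in zip(rows, labels):
        predicted = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        hits += predicted == y
    return hits / len(labels)


def permutation_importance(model, rows, labels, rng):
    """Accuracy drop when one feature column is shuffled at a time."""
    base = accuracy(model, rows, labels)
    drops = {}
    for j, name in enumerate(FEATURES):
        column = [x[j] for x in rows]
        rng.shuffle(column)
        shuffled = [x[:j] + [v] + x[j + 1:] for x, v in zip(rows, column)]
        drops[name] = base - accuracy(model, shuffled, labels)
    return drops


rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_data(600, rng)
X_train, y_train, X_test, y_test = X[:400], y[:400], X[400:], y[400:]

model = fit_logistic(X_train, y_train)
test_accuracy = accuracy(model, X_test, y_test)
importance = permutation_importance(model, X_test, y_test, rng)
ranking = sorted(importance, key=importance.get, reverse=True)

print(f"held-out accuracy: {test_accuracy:.3f}")
print("features ranked by accuracy drop:", ranking)
```

Accuracy here is measured on rows held out from fitting, and the importance scores are computed on those same held-out rows, so a feature ranks highly only if the fitted model actually relies on it for unseen cases.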