Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms
Background: Depression is being increasingly acknowledged as an important risk factor contributing to coronary heart disease (CHD). Currently, there is no predictive model specifically designed to evaluate the risk of coronary heart disease among individuals with depression. We aim to develop a machin...
Main Authors: | Yicheng Wang, Chuan-Yang Wu, Hui-Xian Fu, Jian-Cheng Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2025-01-01 |
Series: | Frontiers in Cardiovascular Medicine |
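Subjects: | depression; machine learning; prediction model; coronary heart disease; National Health and Nutrition Examination Survey (NHANES) |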
Online Access: | https://www.frontiersin.org/articles/10.3389/fcvm.2024.1504957/full |
_version_ | 1841553931237326848 |
---|---|
author | Yicheng Wang; Chuan-Yang Wu; Hui-Xian Fu; Jian-Cheng Zhang |
author_facet | Yicheng Wang; Chuan-Yang Wu; Hui-Xian Fu; Jian-Cheng Zhang |
author_sort | Yicheng Wang |
collection | DOAJ |
description | Background: Depression is increasingly acknowledged as an important risk factor for coronary heart disease (CHD). Currently, no predictive model is specifically designed to evaluate the risk of CHD among individuals with depression. We aim to develop a machine learning (ML) model that analyzes risk factors and forecasts the probability of CHD in individuals with depression. Methods: This study used data from the National Health and Nutrition Examination Survey (NHANES) from 2007–2018, covering 2,085 individuals who had previously been diagnosed with depression. The population was randomly divided into a training set and a validation set at an 8:2 ratio. Univariate and multivariate logistic regression analyses were used to identify independent risk factors for CHD in individuals with depression. Eight machine learning algorithms were applied to the training set to construct models: logistic regression (LR), random forest (RF), gradient boosting machine (GBM), support vector machine (SVM), extreme gradient boosting (XGBoost), classification and regression tree (CART), k-nearest neighbors (KNN), and neural network (NNET). The validation set was used to evaluate and compare the performance of the eight models, with the aim of identifying the most effective algorithm for predicting CHD risk in individuals with depression. The evaluation metrics were the area under the receiver operating characteristic (ROC) curve, the calibration curve, the Brier score, decision curve analysis (DCA), and the precision-recall (PR) curve. The models were also internally validated by the bootstrap method. Results: Univariate and multivariate logistic regression analyses identified age, chest pain status, history of myocardial infarction, serum triglyceride levels, and education level as independent predictors of CHD risk. Of the eight models, the random forest model performed best, with an area under the curve (AUC) of 0.987 for the ROC curve and 0.848 for the PR curve in the training set. In the validation set, the random forest model achieved an AUC of 0.996 for the ROC curve and 0.960 for the PR curve, demonstrating excellent discriminative ability. Calibration curves indicated high congruence between observed and predicted odds, with minimal Brier scores of 0.026 and 0.021 for the training set and validation set, respectively, reinforcing the model's predictive accuracy. DCA curves confirmed a net benefit for the random forest model. Furthermore, the AUC of the random forest model was 0.928 after internal validation by the bootstrap method, indicating good discriminative ability and that the model is useful for clinical assessment of CHD risk in people with depression. Conclusion: The random forest algorithm exhibited the best predictive performance, potentially aiding clinicians in assessing the risk probabilities of CHD within this population. |
format | Article |
id | doaj-art-eedb7664cfdd4dd4a1561e1ec0d00165 |
institution | Kabale University |
issn | 2297-055X |
language | English |
publishDate | 2025-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Cardiovascular Medicine |
spelling | doaj-art-eedb7664cfdd4dd4a1561e1ec0d00165; 2025-01-09T06:10:39Z; eng; Frontiers Media S.A.; Frontiers in Cardiovascular Medicine; 2297-055X; 2025-01-01; 11; 10.3389/fcvm.2024.1504957; 1504957; Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms; Yicheng Wang (0, 1, 2); Chuan-Yang Wu (3); Hui-Xian Fu (4); Jian-Cheng Zhang (0, 1, 2); (0) Shengli Clinical Medical College of Fujian Medical University, Fujian Medical University, Fuzhou, Fujian, China; (1) Department of Cardiovascular Medicine, Fuzhou University Affiliated Provincial Hospital, Fuzhou, Fujian, China; (2) Department of Cardiology, Fujian Provincial Hospital, Fuzhou, Fujian, China; (3) Department of Cardiology, Youxi County General Hospital, Sanming, Fujian, China; (4) Department of Cardiology, Changji Prefecture People’s Hospital in Xinjiang Uygur Autonomous Region, Changji, Xinjiang, China; https://www.frontiersin.org/articles/10.3389/fcvm.2024.1504957/full; depression; machine learning; prediction model; coronary heart disease; National Health and Nutrition Examination Survey (NHANES) |
title | Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms |
topic | depression; machine learning; prediction model; coronary heart disease; National Health and Nutrition Examination Survey (NHANES) |
url | https://www.frontiersin.org/articles/10.3389/fcvm.2024.1504957/full |
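
The description in this record mentions an 8:2 training/validation split, the ROC AUC, the Brier score, and bootstrap internal validation. The following is a minimal, standard-library sketch of those quantities, not the authors' code. All function names, the seed, and the toy labels and probabilities are hypothetical and chosen only for illustration; in particular, `bootstrap_auc` simply resamples fixed predictions, which is a simplification and may differ from the bootstrap procedure used in the article.

```python
import random


def split_train_validation(rows, train_fraction=0.8, seed=0):
    """Shuffle rows and split them into training and validation parts (8:2 by default)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """ROC AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def bootstrap_auc(labels, scores, n_boot=200, seed=0):
    """Average AUC over bootstrap resamples of (label, score) pairs.

    Simplification (an assumption of this sketch): predictions are held fixed
    and only resampled; no model is refitted on each resample.
    """
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample contains a single class
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    return sum(aucs) / len(aucs)


# Hypothetical toy data, not taken from NHANES or from the article.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
toy_probs = [0.10, 0.20, 0.15, 0.30, 0.80, 0.40, 0.35, 0.90, 0.05, 0.70]

# 2,085 is the cohort size stated in the description; the IDs are placeholders.
train_ids, validation_ids = split_train_validation(range(2085))

print(len(train_ids), len(validation_ids))
print(roc_auc(toy_labels, toy_probs))
print(brier_score(toy_labels, toy_probs))
print(bootstrap_auc(toy_labels, toy_probs))
```

A lower Brier score means predicted probabilities sit closer to the observed outcomes, while the AUC only measures ranking, which is why the description reports both.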