Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms

Background: Depression is being increasingly acknowledged as an important risk factor contributing to coronary heart disease (CHD). Currently, there is no predictive model specifically designed to evaluate the risk of coronary heart disease among individuals with depression. We aim to develop a machin...

Bibliographic Details
Main Authors: Yicheng Wang, Chuan-Yang Wu, Hui-Xian Fu, Jian-Cheng Zhang
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Cardiovascular Medicine
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fcvm.2024.1504957/full
_version_ 1841553931237326848
author Yicheng Wang
Yicheng Wang
Yicheng Wang
Chuan-Yang Wu
Hui-Xian Fu
Jian-Cheng Zhang
Jian-Cheng Zhang
Jian-Cheng Zhang
author_facet Yicheng Wang
Yicheng Wang
Yicheng Wang
Chuan-Yang Wu
Hui-Xian Fu
Jian-Cheng Zhang
Jian-Cheng Zhang
Jian-Cheng Zhang
author_sort Yicheng Wang
collection DOAJ
description Background: Depression is being increasingly acknowledged as an important risk factor contributing to coronary heart disease (CHD). Currently, there is no predictive model specifically designed to evaluate the risk of coronary heart disease among individuals with depression. We aim to develop a machine learning (ML) model that will analyze risk factors and forecast the probability of coronary heart disease in individuals suffering from depression.
Methods: This research employed data from the National Health and Nutrition Examination Survey (NHANES) from 2007–2018, which included 2,085 individuals who had previously been diagnosed with depression. The population was randomly divided into a training set and a validation set in an 8:2 ratio. Univariate and multivariate logistic regression analyses were employed to identify independent risk factors for coronary heart disease in individuals with depression. Eight machine learning algorithms were applied to the training set to construct the models: logistic regression (LR), random forest (RF), gradient boosting machine (GBM), support vector machine (SVM), extreme gradient boosting (XGBoost), classification and regression tree (CART), k-nearest neighbors (KNN), and neural network (NNET). The validation set was used to evaluate the performance of the eight models. Several evaluation metrics were used to compare the models and identify the most effective algorithm for predicting coronary heart disease risk in individuals with depression: the area under the receiver operating characteristic (ROC) curve, the calibration curve, Brier scores, decision curve analysis (DCA), and the precision-recall (PR) curve. The models were also internally validated by the bootstrap method.
Results: Univariate and multivariate logistic regression analyses identified age, chest pain status, history of myocardial infarction, serum triglyceride levels, and education level as independent predictors of coronary heart disease risk. Of the eight machine learning models, the random forest model performed best, with an area under the curve (AUC) of 0.987 in the training set and an AUC of 0.848 for the PR curve. In the validation set, the random forest model achieved an AUC of 0.996 and an AUC of 0.960 for the PR curve, demonstrating excellent discriminative ability. Calibration curves indicated high congruence between observed and predicted odds, with low Brier scores of 0.026 and 0.021 for the training set and validation set, respectively, reinforcing the model's predictive accuracy. DCA curves confirmed net benefits of the random forest model across. Furthermore, the AUC of the random forest model was 0.928 after internal validation by the bootstrap method, indicating good discriminative ability, and the model is useful for clinical assessment of the risk of coronary heart disease in depressed people.
Conclusion: The random forest algorithm exhibited the best predictive performance, potentially aiding clinicians in assessing the risk probabilities of coronary heart disease within this population.
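The evaluation steps named in the abstract (an 8:2 random split, ROC AUC, Brier score, and bootstrap resampling) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the seed, and the toy labels and probabilities are all hypothetical, and the bootstrap here only resamples a fixed set of predictions, whereas the study's internal validation may have refit the models on each resample.

```python
import random


def split_8_2(rows, seed=0):
    """Shuffle a copy of rows and split it 8:2 into training and validation sets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 8 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """ROC AUC as the chance a random positive scores above a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(y_true, y_prob, n_boot=200, seed=0):
    """Mean AUC over resamples drawn with replacement from the predictions.

    Simplification (assumption): predictions are held fixed; no model is refit.
    Resamples that contain only one class are skipped.
    """
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        aucs.append(roc_auc(ys, [y_prob[i] for i in idx]))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: 1 = CHD, 0 = no CHD, with made-up predicted probabilities.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
toy_probs = [0.1, 0.3, 0.8, 0.6, 0.2, 0.9, 0.4, 0.05, 0.35, 0.5]

# 2,085 placeholder participant indices, matching the cohort size in the abstract.
train_ids, valid_ids = split_8_2(range(2085))

print("train / validation sizes:", len(train_ids), len(valid_ids))
print("Brier score:", brier_score(toy_labels, toy_probs))
print("ROC AUC:", roc_auc(toy_labels, toy_probs))
print("bootstrap mean AUC:", bootstrap_auc(toy_labels, toy_probs))
```

A lower Brier score means predicted probabilities sit closer to the observed outcomes, so it complements AUC, which only measures how well cases are ranked.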
format Article
id doaj-art-eedb7664cfdd4dd4a1561e1ec0d00165
institution Kabale University
issn 2297-055X
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Cardiovascular Medicine
spelling doaj-art-eedb7664cfdd4dd4a1561e1ec0d00165
2025-01-09T06:10:39Z
eng
Frontiers Media S.A.
Frontiers in Cardiovascular Medicine
2297-055X
2025-01-01
11
10.3389/fcvm.2024.1504957
1504957
Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms
Yicheng Wang: Shengli Clinical Medical College of Fujian Medical University, Fujian Medical University, Fuzhou, Fujian, China
Yicheng Wang: Department of Cardiovascular Medicine, Fuzhou University Affiliated Provincial Hospital, Fuzhou, Fujian, China
Yicheng Wang: Department of Cardiology, Fujian Provincial Hospital, Fuzhou, Fujian, China
Chuan-Yang Wu: Department of Cardiology, Youxi County General Hospital, Sanming, Fujian, China
Hui-Xian Fu: Department of Cardiology, Changji Prefecture People’s Hospital in Xinjiang Uygur Autonomous Region, Changji, Xinjiang, China
Jian-Cheng Zhang: Shengli Clinical Medical College of Fujian Medical University, Fujian Medical University, Fuzhou, Fujian, China
Jian-Cheng Zhang: Department of Cardiovascular Medicine, Fuzhou University Affiliated Provincial Hospital, Fuzhou, Fujian, China
Jian-Cheng Zhang: Department of Cardiology, Fujian Provincial Hospital, Fuzhou, Fujian, China
https://www.frontiersin.org/articles/10.3389/fcvm.2024.1504957/full
depression
machine learning
prediction model
coronary heart disease
National Health and Nutrition Examination Survey (NHANES)
spellingShingle Yicheng Wang
Yicheng Wang
Yicheng Wang
Chuan-Yang Wu
Hui-Xian Fu
Jian-Cheng Zhang
Jian-Cheng Zhang
Jian-Cheng Zhang
Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms
Frontiers in Cardiovascular Medicine
depression
machine learning
prediction model
coronary heart disease
National Health and Nutrition Examination Survey (NHANES)
title Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms
title_full Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms
title_fullStr Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms
title_full_unstemmed Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms
title_short Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms
title_sort development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms
topic depression
machine learning
prediction model
coronary heart disease
National Health and Nutrition Examination Survey (NHANES)
url https://www.frontiersin.org/articles/10.3389/fcvm.2024.1504957/full