Comparison between traditional logistic regression and machine learning for predicting mortality in adult sepsis patients

Bibliographic Details
Main Authors: Hongsheng Wu, Biling Liao, Tengfei Ji, Keqiang Ma, Yumei Luo, Shengmin Zhang
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Medicine
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fmed.2024.1496869/full
collection DOAJ
description Background: Sepsis is a life-threatening disease with a high mortality rate, which motivates the search for new models to predict prognosis in this patient population. This study compared the performance of traditional logistic regression and machine learning models in predicting adult sepsis mortality.
Objective: To develop an optimal model for predicting mortality in adult sepsis patients by comparing traditional logistic regression with machine learning methods.
Methods: A retrospective analysis was conducted on 606 adult sepsis inpatients at our medical center between January 2020 and December 2022, who were randomly divided into training and validation sets in a 7:3 ratio. Traditional logistic regression and machine learning methods were used to predict mortality in adult sepsis. Univariate analysis followed by backward stepwise regression identified independent risk factors for the logistic regression model, while Least Absolute Shrinkage and Selection Operator (LASSO) regression performed variable shrinkage and selection for the machine learning model. Among the machine learning models considered (Bagged Tree, Boost Tree, Decision Tree, LightGBM, Naïve Bayes, Nearest Neighbors, Support Vector Machine (SVM), and Random Forest (RF)), the one with the maximum area under the curve (AUC) was chosen for model construction. Model validation and comparison with the Sequential Organ Failure Assessment (SOFA) and the Acute Physiology and Chronic Health Evaluation (APACHE) scores were performed in the validation set using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) curves.
Results: Univariate analysis assessed 17 variables: gender, history of coronary heart disease (CHD), systolic pressure, white blood cell count (WBC), neutrophil count (NEUT), lymphocyte count (LYMP), lactic acid, neutrophil-to-lymphocyte ratio (NLR), red blood cell distribution width (RDW), interleukin-6 (IL-6), prothrombin time (PT), international normalized ratio (INR), fibrinogen (FBI), D-dimer, aspartate aminotransferase (AST), total bilirubin (Tbil), and lung infection. All of these variables differed significantly (p < 0.05) between the survival and non-survival groups. Backward stepwise regression then identified systolic pressure, lactic acid, NLR, RDW, IL-6, PT, and Tbil as independent risk factors. These factors were incorporated into a logistic regression model, chosen for its minimum Akaike Information Criterion (AIC) value (98.65). Among the machine learning models, the RF model had the maximum AUC (0.999) and was selected. LASSO regression with the lambda.1SE criterion identified systolic pressure, lactic acid, NEUT, RDW, IL-6, INR, and Tbil as the variables for constructing the RF model, which was validated through ten-fold cross-validation. The RF model was then validated and compared with the traditional logistic model and with SOFA and APACHE scoring.
Conclusion: Based on machine learning principles, the RF model demonstrates advantages over traditional logistic regression models in predicting adult sepsis prognosis. The RF model holds significant potential for clinical surveillance and interventions to improve outcomes for sepsis patients.
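The two mechanical steps named in the Methods, a random 7:3 split and picking the candidate model with the highest AUC, can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' code: the function names, the seed, the rounding rule for the split, and the toy labels and scores are all assumptions made for the example, and none of the numbers come from the study.

```python
import random


def split_70_30(n_patients, seed=0):
    """Shuffle patient indices and split them 7:3 (rounding rule is an assumption)."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    n_train = round(n_patients * 0.7)
    return indices[:n_train], indices[n_train:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outranks a random negative.

    Ties count as half a win (the Mann-Whitney U formulation of the AUC).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def pick_best_model(labels, scores_by_model):
    """Return the name of the model with the highest AUC, plus every AUC."""
    aucs = {name: roc_auc(labels, s) for name, s in scores_by_model.items()}
    return max(aucs, key=aucs.get), aucs


# 606 is the cohort size given in the abstract; the split itself is illustrative.
train_idx, valid_idx = split_70_30(606)

# Hypothetical outcomes (1 = died) and predicted risks for six validation patients.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8, 0.2, 0.9],
    "model_b": [0.2, 0.3, 0.6, 0.7, 0.1, 0.9],
}
best_name, toy_aucs = pick_best_model(toy_labels, toy_scores)
print(len(train_idx), len(valid_idx), best_name, toy_aucs)
```

The pairwise AUC is quadratic in the number of cases, which is fine at this cohort size. A real analysis would more likely use an established statistics library and would also examine calibration and decision curves, as the abstract describes.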
id doaj-art-26f9fb4508eb43e3b85068b1e1bfe26c
institution Kabale University
issn 2296-858X
spelling doaj-art-26f9fb4508eb43e3b85068b1e1bfe26c; 2025-01-06T06:59:50Z; eng; Frontiers Media S.A.; Frontiers in Medicine; 2296-858X; 2025-01-01; 11; 10.3389/fmed.2024.1496869; 1496869
topic machine learning
random forest
logistic regression
adult sepsis
mortality