Machine learning analysis of emerging risk factors for early-onset hypertension in the Tlalpan 2020 cohort

Introduction: Hypertension is a significant public health concern. Several relevant risk factors have been identified. However, since it is a complex condition with broad variability and strong dependence on environmental and lifestyle factors, current risk factors only account for a fraction of the o...

Bibliographic Details
Main Authors: Mireya Martínez-García, Guadalupe O. Gutiérrez-Esparza, Manlio F. Márquez, Luis M. Amezcua-Guerra, Enrique Hernández-Lemus
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
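Series: Frontiers in Cardiovascular Medicine
Subjects: machine learning models; hypertension; sleep disorders; sedentary lifestyle; high-fat foods consumption; energy drink consumption
Online Access: https://www.frontiersin.org/articles/10.3389/fcvm.2024.1434418/full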
collection DOAJ
description Introduction: Hypertension is a significant public health concern. Several relevant risk factors have been identified. However, since it is a complex condition with broad variability and strong dependence on environmental and lifestyle factors, current risk factors only account for a fraction of the observed prevalence. This study aims to investigate emerging risk factors for early-onset hypertension using a data-driven approach, implementing machine learning models within a well-established cohort in Mexico City that initially comprised 2,500 healthy adults aged 18 to 50 years.
Methods: Hypertensive individuals were newly diagnosed during 6,000 person-years, and normotensive individuals were those who, during the same time, did not exceed a systolic blood pressure of 140 mm Hg and/or a diastolic blood pressure of 90 mm Hg. Data on sociodemographic, lifestyle, anthropometric, clinical, and biochemical variables were collected through standardized questionnaires as well as clinical and laboratory assessments. Extreme Gradient Boosting (XGBoost), Logistic Regression (LG) and Support Vector Machines (SVM) were employed to evaluate the relationship between these factors and hypertension risk.
Results: The Random Forest (RF) Importance Percent was calculated to assess the structural relevance of each variable in the model, while Shapley Additive Explanations (SHAP) analysis quantified both the average impact and direction of each feature on individual predictions. Additionally, odds ratios were calculated to express the size and direction of influence for each variable, and a sex-stratified analysis was conducted to identify any gender-specific risk factors.
Discussion: This nested study provides evidence that sleep disorders, a sedentary lifestyle, and consumption of high-fat foods and energy drinks are potentially modifiable risk factors for hypertension in a Mexico City cohort of young and relatively healthy adults. These findings underscore the importance of addressing these factors in hypertension prevention and management strategies.
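The abstract reports odds ratios but does not say how they were computed. As a purely illustrative sketch, the snippet below assumes an unadjusted 2x2 exposure-by-outcome table and a Woolf (log-scale) 95% confidence interval. The function name `odds_ratio` and all counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases,
               unexposed_cases, unexposed_noncases, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using the Woolf log method.

    z=1.96 gives an approximate 95% confidence interval.
    """
    a, b, c, d = (exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases)
    if min(a, b, c, d) <= 0:
        # The log method is undefined when any cell is zero.
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(estimate)
    return estimate, math.exp(log_or - z * se_log), math.exp(log_or + z * se_log)


# Hypothetical counts for illustration only (not study data):
# 30 of 100 exposed and 20 of 100 unexposed participants developed hypertension.
estimate, lower, upper = odds_ratio(30, 70, 20, 80)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An odds ratio above 1 points toward higher odds of the outcome among the exposed, and an interval that includes 1 means the table alone does not establish a direction. Adjusted odds ratios from a fitted logistic regression would instead be obtained by exponentiating the model coefficients.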
id doaj-art-1cf4c79b28694855b6b339d30453cf4c
institution Kabale University
issn 2297-055X
spelling Record ID: doaj-art-1cf4c79b28694855b6b339d30453cf4c
Record timestamp: 2025-01-17T06:50:44Z
Language: eng
Publisher: Frontiers Media S.A.
Journal: Frontiers in Cardiovascular Medicine
ISSN: 2297-055X
Published: 2025-01-01
Volume: 11
DOI: 10.3389/fcvm.2024.1434418
Article number: 1434418
Title: Machine learning analysis of emerging risk factors for early-onset hypertension in the Tlalpan 2020 cohort
Authors and affiliations (indexed as in the record):
0. Mireya Martínez-García: Department of Immunology, Instituto Nacional de Cardiología Ignacio Chávez, México City, México
1. Guadalupe O. Gutiérrez-Esparza: Investigadora por México CONAHCYT Consejo Nacional de Humanidades, Ciencias y Tecnologías, México City, México
2. Guadalupe O. Gutiérrez-Esparza: Diagnostic and Treatment Division, Instituto Nacional de Cardiología Ignacio Chávez, México City, México
3. Manlio F. Márquez: Diagnostic and Treatment Division, Instituto Nacional de Cardiología Ignacio Chávez, México City, México
4. Manlio F. Márquez: Department of Electrocardiology, Instituto Nacional de Cardiología Ignacio Chávez, México City, México
5. Luis M. Amezcua-Guerra: Department of Immunology, Instituto Nacional de Cardiología Ignacio Chávez, México City, México
6. Enrique Hernández-Lemus: Computational Genomics Division, Instituto Nacional de Medicina Genómica, México City, México
7. Enrique Hernández-Lemus: Center for Complexity Sciences, Universidad Nacional Autónoma de México, México City, México
topic machine learning models
hypertension
sleep disorders
sedentary lifestyle
high-fat foods consumption
energy drink consumption