Multimodal machine learning for analysing multifactorial causes of disease—The case of childhood overweight and obesity in Mexico

Bibliographic Details
Main Authors: Rosario Silva Sepulveda, Magnus Boman
Format: Article
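Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Public Health
Subjects: supervised machine learning; unsupervised machine learning; multimodal machine learning; bias; paediatric obesity; obesogenic environment
Online Access: https://www.frontiersin.org/articles/10.3389/fpubh.2024.1369041/full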
collection DOAJ
description
Background: Mexico has one of the highest global incidences of paediatric overweight and obesity. Public health interventions have shown only moderate success, possibly from relying on knowledge extracted using limited types of statistical data analysis methods.
Purpose: To explore if multimodal machine learning can enhance identifying predictive features from obesogenic environments and investigating complex disease or social patterns, using the Mexican National Health and Nutrition Survey.
Methods: We grouped features into five data modalities corresponding to paediatric population exogenous factors, in two multimodal machine learning pipelines, against a unimodal early fusion baseline. The supervised pipeline employed four methods: Linear classifier with Elastic Net regularisation, k-Nearest Neighbour, Decision Tree, and Random Forest. The unsupervised pipeline used traditional methods with k-Means and hierarchical clustering, with the optimal number of clusters calculated to be k = 2.
Results: The decision tree classifier in the supervised early fusion approach produced the best quantitative results. The top five most important features for classifying child or adolescent health were measures of an adult in the household, selected at random: BMI, obesity diagnosis, being single, seeking care at private healthcare, and having paid TV in the home. Unsupervised learning approaches varied in the optimal number of clusters but agreed on the importance of home environment features when analysing inter-cluster patterns. Main findings from this study differed from previous studies using only traditional statistical methods on the same database. Notably, the BMI of a randomised adult within the household emerged as the most important feature, rather than maternal BMI, as reported in previous literature where unwanted cultural bias went undetected.
Conclusion: Our general conclusion is that multimodal machine learning is a promising approach for comprehensively analysing obesogenic environments. The modalities allowed for a multimodal approach designed to critically analyse data signal strength and reveal sources of unwanted bias. In particular, it may aid in developing more effective public health policies to address the ongoing paediatric obesity epidemic in Mexico.
format Article
id doaj-art-c9a2c68e9637423a8048402061070f12
institution Kabale University
issn 2296-2565
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Public Health
spelling doaj-art-c9a2c68e9637423a8048402061070f12 2025-01-07T06:40:32Z eng Frontiers Media S.A. Frontiers in Public Health 2296-2565 2025-01-01 12 10.3389/fpubh.2024.1369041 1369041
Rosario Silva Sepulveda 0, Magnus Boman 1, Magnus Boman 2
0: Karolinska Institutet, Department of Medicine Solna, Division of Clinical Epidemiology, Stockholm, Sweden
1: Karolinska Institutet, Department of Medicine Solna, Division of Clinical Epidemiology, Stockholm, Sweden
2: MedTechLabs, BioClinicum, Karolinska University Hospital, Stockholm, Sweden
topic supervised machine learning
unsupervised machine learning
multimodal machine learning
bias
paediatric obesity
obesogenic environment
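The Methods part of the description mentions an early fusion baseline and four supervised classifiers. The sketch below is only an illustration of that general idea, not the authors' pipeline: it concatenates five feature groups into one matrix (early fusion) and cross-validates four scikit-learn classifiers on it. The data are synthetic, the modality names and sizes are hypothetical, and the use of SGDClassifier as the elastic-net linear classifier is an assumption, since the record does not say which implementation or settings were used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_samples = 300

# Hypothetical stand-ins for the five data modalities (names and sizes are
# assumptions; the record does not list them). Each is a block of features.
modalities = {
    f"modality_{i}": rng.normal(size=(n_samples, 4)) for i in range(1, 6)
}

# Synthetic binary label, loosely driven by one feature of the first modality.
signal = modalities["modality_1"][:, 0]
y = (signal + 0.5 * rng.normal(size=n_samples) > 0).astype(int)

# Early fusion: concatenate all modality blocks into a single feature matrix
# before any model sees the data.
X_fused = np.hstack(list(modalities.values()))

# The four method families named in the abstract, with illustrative settings.
models = {
    "elastic_net_linear": make_pipeline(
        StandardScaler(),
        SGDClassifier(penalty="elasticnet", l1_ratio=0.5, random_state=0),
    ),
    "k_nearest_neighbour": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}


def evaluate(models, X, y, folds=5):
    """Return the mean cross-validated accuracy for each model."""
    return {
        name: float(cross_val_score(model, X, y, cv=folds).mean())
        for name, model in models.items()
    }


scores = evaluate(models, X_fused, y)
for name, score in scores.items():
    print(f"{name}: mean accuracy {score:.3f}")
```

Comparing such a fused baseline against models trained on each modality separately is one way to see which feature group carries the signal; the accuracies printed here reflect only the synthetic data and say nothing about the study's results.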