Eco-Evolutionary Drivers of Vibrio parahaemolyticus Sequence Type 3 Expansion: Retrospective Machine Learning Approach

Background: Environmentally sensitive pathogens exhibit ecological and evolutionary responses to climate change that result in the emergence and global expansion of well-adapted variants. It is imperative to understand the mechanisms that facilitate pathogen emergence and expan...

Bibliographic Details
Main Authors: Amy Marie Campbell, Chris Hauton, Ronny van Aerle, Jaime Martinez-Urtaza
Format: Article
Language: English
Published: JMIR Publications 2024-11-01
Series: JMIR Bioinformatics and Biotechnology
Online Access: https://bioinform.jmir.org/2024/1/e62747
collection DOAJ
description Background: Environmentally sensitive pathogens exhibit ecological and evolutionary responses to climate change that result in the emergence and global expansion of well-adapted variants. It is imperative to understand the mechanisms that facilitate pathogen emergence and expansion, as well as the drivers behind the mechanisms, to understand and prepare for future pandemic expansions.
Objective: The unique, rapid, global expansion of a clonal complex of Vibrio parahaemolyticus (a marine bacterium causing gastroenteritis infections) named Vibrio parahaemolyticus sequence type 3 (VpST3) provides an opportunity to explore the eco-evolutionary drivers of pathogen expansion.
Methods: The global expansion of VpST3 was reconstructed using VpST3 genomes, which were then classified into metrics characterizing the stages of this expansion process, indicative of the stages of emergence and establishment. We used machine learning, specifically a random forest classifier, to test a range of ecological and evolutionary drivers for their potential in predicting VpST3 expansion dynamics.
Results: We identified a range of evolutionary features, including mutations in the core genome and accessory gene presence, associated with expansion dynamics. A range of random forest classifier approaches were tested to predict expansion classification metrics for each genome. The highest predictive accuracies (ranging from 0.722 to 0.967) were achieved for models using a combined eco-evolutionary approach. While population structure and the difference between introduced and established isolates could be predicted with high accuracy, our model reported multiple false positives when predicting the success of an introduced isolate, suggesting potential limiting factors not represented in our eco-evolutionary features. Regional models produced for the 2 countries reporting the most VpST3 genomes had varying success, reflecting the impacts of class imbalance.
Conclusions: These novel insights into evolutionary features and ecological conditions related to the stages of VpST3 expansion showcase the potential of machine learning models using genomic data and will contribute to the future understanding of the eco-evolutionary pathways of climate-sensitive pathogens.
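The abstract reports accuracy alongside false positives and class imbalance. The following is a minimal sketch of how those quantities relate. It is not the authors' code, and every label in it is invented for illustration (the class names "success" and "failed", the helper names, and the counts are all assumptions). It shows that when one class dominates, a classifier can have reasonable accuracy while most of its positive predictions are false positives, and that a trivial majority-class baseline may score higher on accuracy alone.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(tp, fp):
    """Fraction of positive predictions that are correct (0.0 if none made)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical, imbalanced labels: 2 successful and 8 failed introductions.
y_true = ["success"] * 2 + ["failed"] * 8
# Hypothetical predictions: both successes found, but 3 failures flagged as successes.
y_pred = ["success"] * 2 + ["success"] * 3 + ["failed"] * 5

tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive="success")
model_accuracy = accuracy(y_true, y_pred)
model_precision = precision(tp, fp)

# Accuracy of always predicting the most common class.
majority_baseline = Counter(y_true).most_common(1)[0][1] / len(y_true)

print(f"accuracy={model_accuracy:.2f} precision={model_precision:.2f} "
      f"false_positives={fp} majority_baseline={majority_baseline:.2f}")
```

In this invented case the majority-class baseline has a higher accuracy than the hypothetical model, which is one reason accuracy is usually read together with per-class counts when classes are imbalanced.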
id doaj-art-bd15d37380b74f1e8c2aeb0ca1fa1f22
institution Kabale University
issn 2563-3570
spelling doaj-art-bd15d37380b74f1e8c2aeb0ca1fa1f22; 2024-11-28T18:45:33Z; eng; JMIR Publications; JMIR Bioinformatics and Biotechnology; 2563-3570; 2024-11-01; 5; e62747; 10.2196/62747
Eco-Evolutionary Drivers of Vibrio parahaemolyticus Sequence Type 3 Expansion: Retrospective Machine Learning Approach
Amy Marie Campbell, https://orcid.org/0000-0003-4111-8286
Chris Hauton, https://orcid.org/0000-0002-2313-4226
Ronny van Aerle, https://orcid.org/0000-0002-2605-6518
Jaime Martinez-Urtaza, https://orcid.org/0000-0001-6219-0418
https://bioinform.jmir.org/2024/1/e62747