DeepPhylo: Phylogeny‐Aware Microbial Embeddings Enhanced Predictive Accuracy in Human Microbiome Data Analysis

Bibliographic Details
Main Authors: Bin Wang, Yulong Shen, Jingyan Fang, Xiaoquan Su, Zhenjiang Zech Xu
Format: Article
Language: English
Published: Wiley 2024-12-01
Series: Advanced Science
Subjects: beta‐diversity, deep learning, microbiome, phylogeny
Online Access: https://doi.org/10.1002/advs.202404277
author Bin Wang
Yulong Shen
Jingyan Fang
Xiaoquan Su
Zhenjiang Zech Xu
collection DOAJ
description Abstract Microbial data analysis poses significant challenges due to its high dimensionality, sparsity, and compositionality. Recent advances have shown that integrating abundance and phylogenetic information is an effective strategy for uncovering robust patterns and enhancing the predictive performance in microbiome studies. However, existing methods primarily focus on the hierarchical structure of phylogenetic trees, overlooking the evolutionary distances embedded within them. This study introduces DeepPhylo, a novel method that employs phylogeny‐aware amplicon embeddings to effectively integrate abundance and phylogenetic information. DeepPhylo improves both the unsupervised discriminatory power and supervised predictive accuracy of microbiome data analysis. Compared to the existing methods, DeepPhylo demonstrates superiority in informing biologically relevant insights across five real‐world microbiome use cases, including clustering of skin microbiomes, prediction of host chronological age and gender, diagnosis of inflammatory bowel disease (IBD) across 15 studies, and multilabel disease classification.
format Article
id doaj-art-d6132095d19d4991bb55d7dfb1986f96
institution Kabale University
issn 2198-3844
language English
publishDate 2024-12-01
publisher Wiley
record_format Article
series Advanced Science
spelling doaj-art-d6132095d19d4991bb55d7dfb1986f96
2024-12-04T12:14:55Z
eng
Wiley
Advanced Science
2198-3844
2024-12-01
Volume 11, Issue 45, pages n/a
10.1002/advs.202404277
DeepPhylo: Phylogeny‐Aware Microbial Embeddings Enhanced Predictive Accuracy in Human Microbiome Data Analysis
Bin Wang (School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031, China)
Yulong Shen (School of Information Engineering, Nanchang University, Nanchang 330031, China)
Jingyan Fang (School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031, China)
Xiaoquan Su (College of Computer Science and Technology, Qingdao University, Qingdao 266071, China)
Zhenjiang Zech Xu (School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031, China)
https://doi.org/10.1002/advs.202404277
beta‐diversity
deep learning
microbiome
phylogeny
title DeepPhylo: Phylogeny‐Aware Microbial Embeddings Enhanced Predictive Accuracy in Human Microbiome Data Analysis
topic beta‐diversity
deep learning
microbiome
phylogeny
url https://doi.org/10.1002/advs.202404277