Improved detection of microbiome-disease associations via population structure-aware generalized linear mixed effects models (microSLAM).

Bibliographic Details
Main Authors: Miriam Goldman, Chunyu Zhao, Katherine S Pollard
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-05-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1012277
Description
Summary: Microbiome association studies typically link host disease or other traits to summary statistics measured in metagenomics data, such as diversity or taxonomic composition. But identifying disease-associated species based on their relative abundance does not provide insight into why these microbes act as disease markers, and it overlooks cases where disease risk is related to specific strains with unique biological functions. To bridge this knowledge gap, we developed microSLAM, a mixed-effects model and an R package that performs association tests that connect host traits to the presence/absence of genes within each microbiome species, while accounting for strain genetic relatedness across hosts. Traits can be quantitative or binary (such as case/control). MicroSLAM is fit in three steps for each species. The first step estimates population structure across hosts. Step two calculates the association between population structure and the trait, enabling detection of species for which a subset of related strains confer risk. To identify specific genes whose presence/absence across diverse strains is associated with the trait, step three models the trait as a function of gene occurrence plus random effects estimated from step two. Applying microSLAM to 710 gut metagenomes from inflammatory bowel disease (IBD) samples, we discovered 56 species whose population structure correlates with IBD, meaning that different lineages are found in cases versus controls. After controlling for population structure, 20 species had genes significantly associated with IBD. Twenty-one of these genes were more common in IBD patients, while 32 genes were enriched in healthy controls, including a seven-gene operon in Faecalibacterium prausnitzii that is involved in utilization of fructoselysine from the gut environment. The vast majority of species detected by microSLAM were not significantly associated with IBD using standard relative abundance tests. These findings highlight the importance of accounting for within-species genetic variation in microbiome studies.
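
The three-step idea in the summary can be illustrated with a rough Python sketch. This is not the microSLAM R package or its API, and it does not fit a mixed-effects model. It assumes a hosts x genes 0/1 matrix as input, uses "one minus the fraction of mismatched genes" as a hypothetical relatedness measure, and swaps the random effects for a simpler stand-in: the leading principal components of the relatedness matrix entered as fixed covariates in an ordinary least-squares fit for a quantitative trait. All function and variable names are made up for illustration.

```python
import numpy as np


def relatedness_matrix(presence):
    """Host-by-host similarity from a hosts x genes 0/1 matrix (assumed layout)."""
    presence = np.asarray(presence, dtype=float)
    n_genes = presence.shape[1]
    # Fraction of genes whose presence/absence differs between each pair of hosts.
    mismatch = np.abs(presence[:, None, :] - presence[None, :, :]).sum(axis=2) / n_genes
    return 1.0 - mismatch


def structure_covariates(grm, n_components=2):
    """Leading principal components of the double-centered relatedness matrix.

    Stand-in for population-structure random effects (an assumption of this sketch).
    """
    n = grm.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    centered = centering @ grm @ centering
    eigvals, eigvecs = np.linalg.eigh(centered)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top]


def gene_effect(trait, gene, covariates):
    """Least-squares coefficient of one gene, adjusting for structure covariates."""
    design = np.column_stack([np.ones(len(trait)), gene, covariates])
    coef, *_ = np.linalg.lstsq(design, np.asarray(trait, dtype=float), rcond=None)
    return coef[1]  # column 1 of the design matrix is the gene indicator


# Toy example: hosts 0-1 carry one lineage, hosts 2-3 another.
toy_presence = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
])
toy_grm = relatedness_matrix(toy_presence)
```

In the toy matrix, hosts 0 and 1 differ at one of four genes, so their similarity is 0.75, while hosts 0 and 2 differ at every gene and get 0. A binary case/control trait would need a logistic link in place of the least-squares fit.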
ISSN: 1553-734X, 1553-7358