Evaluation of Bayesian Linear Regression models for gene set prioritization in complex diseases.

Genome-wide association studies (GWAS) provide valuable insights into the genetic architecture of complex traits, yet interpreting their results remains challenging due to the polygenic nature of most traits. Gene set analysis offers a solution by aggregating genetic variants into biologically relev...

Bibliographic Details
Main Authors: Tahereh Gholipourshahraki, Zhonghao Bai, Merina Shrestha, Astrid Hjelholt, Sile Hu, Mads Kjolby, Palle Duun Rohde, Peter Sørensen
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-11-01
Series: PLoS Genetics
Online Access: https://doi.org/10.1371/journal.pgen.1011463
_version_ 1846162654805098496
author Tahereh Gholipourshahraki
Zhonghao Bai
Merina Shrestha
Astrid Hjelholt
Sile Hu
Mads Kjolby
Palle Duun Rohde
Peter Sørensen
author_sort Tahereh Gholipourshahraki
collection DOAJ
description Genome-wide association studies (GWAS) provide valuable insights into the genetic architecture of complex traits, yet interpreting their results remains challenging due to the polygenic nature of most traits. Gene set analysis offers a solution by aggregating genetic variants into biologically relevant pathways, enhancing the detection of coordinated effects across multiple genes. In this study, we present and evaluate a gene set prioritization approach utilizing Bayesian Linear Regression (BLR) models to uncover shared genetic components among different phenotypes and facilitate biological interpretation. Through extensive simulations and analyses of real traits, we demonstrate the efficacy of the BLR model in prioritizing pathways for complex traits. Simulation studies reveal insights into the model's performance under various scenarios, highlighting the impact of factors such as the number of causal genes, proportions of causal variants, heritability, and disease prevalence. Comparative analyses with MAGMA (Multi-marker Analysis of GenoMic Annotation) demonstrate BLR's superior performance, especially in highly overlapped gene sets. Application of both single-trait and multi-trait BLR models to real data, specifically GWAS summary data for type 2 diabetes (T2D) and related phenotypes, identifies significant associations with T2D-related pathways. Furthermore, comparison between single- and multi-trait BLR analyses highlights the superior performance of the multi-trait approach in identifying associated pathways, showcasing increased statistical power when analyzing multiple traits jointly. Additionally, enrichment analysis with integrated data from various public resources supports our results, confirming significant enrichment of diabetes-related genes within the top T2D pathways resulting from the multi-trait analysis. 
The BLR model's ability to handle diverse genomic features, perform regularization, conduct variable selection, and integrate information from multiple traits, genders, and ancestries demonstrates its utility in understanding the genetic architecture of complex traits. Our study provides insights into the potential of the BLR model to prioritize gene sets, offering a flexible framework applicable to various datasets. This model presents opportunities for advancing personalized medicine by exploring the genetic underpinnings of multifactorial traits.
format Article
id doaj-art-e377dec24a144c98a0fb775b561fcd7d
institution Kabale University
issn 1553-7390
1553-7404
language English
publishDate 2024-11-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS Genetics
spelling doaj-art-e377dec24a144c98a0fb775b561fcd7d
2024-11-20T05:30:59Z
eng
Public Library of Science (PLoS)
PLoS Genetics
1553-7390
1553-7404
2024-11-01
20
11
e1011463
10.1371/journal.pgen.1011463
Evaluation of Bayesian Linear Regression models for gene set prioritization in complex diseases.
Tahereh Gholipourshahraki
Zhonghao Bai
Merina Shrestha
Astrid Hjelholt
Sile Hu
Mads Kjolby
Palle Duun Rohde
Peter Sørensen
https://doi.org/10.1371/journal.pgen.1011463
title Evaluation of Bayesian Linear Regression models for gene set prioritization in complex diseases.
title_sort evaluation of bayesian linear regression models for gene set prioritization in complex diseases
url https://doi.org/10.1371/journal.pgen.1011463