Block selection in multiblock partial least squares for modeling genotype-phenotype relations in Saccharomyces.
In data-based modeling, correlations between explanatory variables often lead to the formation of distinct gene blocks. This study focuses on identifying influential gene blocks and key variables within these blocks, with a particular application in mind: genotype-phenotype mapping in Saccharomyces....
Main Authors: | Muhammad Tahir, Bu Yude, Tahir Mehmood, Saima Bashir, Zeeshan Ashraf |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2025-01-01 |
Series: | PLoS ONE |
Online Access: | https://doi.org/10.1371/journal.pone.0316350 |
_version_ | 1841555594960437248 |
---|---|
author | Muhammad Tahir, Bu Yude, Tahir Mehmood, Saima Bashir, Zeeshan Ashraf |
collection | DOAJ |
description | In data-based modeling, correlations between explanatory variables often lead to the formation of distinct gene blocks. This study focuses on identifying influential gene blocks and key variables within these blocks, with a particular application in mind: genotype-phenotype mapping in Saccharomyces. To overcome the challenges of a limited sample size, we use partial least squares (PLS). These gene blocks, which consist of combinations of genes, play a critical role in explaining phenotypic variations. Using multiblock partial least squares (mbPLS), we propose a novel approach, weighted block importance on projection (BwIP-mbPLS), to identify influential gene blocks. Variable importance on projection (VIP) is used to select significant genes within these blocks. Our study models the efficiency and rate phenotypes of Saccharomyces cerevisiae for copper chloride at 0.375 mM and melibiose at 2%. k-means clustering, assessed by the silhouette index and the total within-cluster distance, groups 5629 genes into 18 gene blocks. BwIP-mbPLS identifies 4 gene blocks on average and significantly improves the prediction of efficiency-based phenotypes. In contrast, traditional block importance on projection (BIP-mbPLS) identifies 6 gene blocks on average and shows comparable or better performance than BwIP-mbPLS for rate-based phenotypes. Most gene blocks contain fewer than 10 influential genes. Both proposed variants consistently outperform conventional approaches such as partial least squares and multiblock partial least squares in predicting phenotypes. These results highlight the potential of our methods for advancing data-based modeling and genotype-phenotype mapping. |
format | Article |
id | doaj-art-45fd376b7fec4528973c8813272e7236 |
institution | Kabale University |
issn | 1932-6203 |
language | English |
publishDate | 2025-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
title | Block selection in multiblock partial least squares for modeling genotype-phenotype relations in Saccharomyces. |
url | https://doi.org/10.1371/journal.pone.0316350 |
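
The description mentions variable importance on projection (VIP) as the criterion for selecting genes within a block. As a rough illustration only, the sketch below computes the standard VIP score from a single-response PLS fit using NumPy. It is not the authors' BwIP-mbPLS or BIP-mbPLS procedure; the function names `pls1_nipals` and `vip_scores`, the synthetic data, and the choice of two components are all assumptions made for the example.

```python
import numpy as np


def pls1_nipals(X, y, n_components):
    """Fit single-response PLS with NIPALS (hypothetical helper).

    Returns unit-norm weights W (p x A), scores T (n x A) and the
    y-loadings q (A,).
    """
    Xk = X - X.mean(axis=0)          # centre the predictors
    yk = y - y.mean()                # centre the response
    n, p = Xk.shape
    W = np.zeros((p, n_components))
    T = np.zeros((n, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = Xk.T @ yk                # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                   # scores for this component
        tt = t @ t
        p_load = Xk.T @ t / tt       # X-loadings used for deflation
        q[a] = yk @ t / tt           # regression of y on the scores
        Xk = Xk - np.outer(t, p_load)
        yk = yk - q[a] * t
        W[:, a] = w
        T[:, a] = t
    return W, T, q


def vip_scores(W, T, q):
    """Standard VIP: weight each variable by the y-variance explained per component."""
    p = W.shape[0]
    ss = q ** 2 * np.sum(T ** 2, axis=0)              # explained y sum of squares
    w_norm2 = (W / np.linalg.norm(W, axis=0)) ** 2    # squared normalised weights
    return np.sqrt(p * (w_norm2 @ ss) / ss.sum())


# Synthetic stand-in for one gene block: 200 samples, 30 variables,
# with the response driven by the first three variables (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=200)

W, T, q = pls1_nipals(X, y, n_components=2)
vip = vip_scores(W, T, q)
selected = np.flatnonzero(vip > 1.0)  # common rule of thumb, not the paper's threshold
```

By construction the squared VIP scores sum to the number of variables, so a cutoff of 1 marks variables with above-average importance; the threshold actually used in the article may differ.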