Optimizing enzyme thermostability by combining multiple mutations using protein language model

Bibliographic Details
Main Authors: Jiahao Bian, Pan Tan, Ting Nie, Liang Hong, Guang‐Yu Yang
Format: Article
Language: English
Published: Wiley 2024-12-01
Series: mLife
Subjects:
Online Access: https://doi.org/10.1002/mlf2.12151
collection DOAJ
description Abstract Optimizing enzyme thermostability is essential for advancements in protein science and industrial applications. Currently, (semi‐)rational design and random mutagenesis methods can accurately identify single‐point mutations that enhance enzyme thermostability. However, complex epistatic interactions often arise when multiple mutation sites are combined, leading to the complete inactivation of combinatorial mutants. As a result, constructing an optimized enzyme often requires repeated rounds of design to incrementally incorporate single mutation sites, which is highly time‐consuming. In this study, we developed an AI‐aided strategy for enzyme thermostability engineering that efficiently facilitates the recombination of beneficial single‐point mutations. We utilized thermostability data from creatinase, including 18 single‐point mutants, 22 double‐point mutants, 21 triple‐point mutants, and 12 quadruple‐point mutants. Using these data as inputs, we used a temperature‐guided protein language model, Pro‐PRIME, to learn epistatic features and design combinatorial mutants. After two rounds of design, we obtained 50 combinatorial mutants with superior thermostability, achieving a success rate of 100%. The best mutant, 13M4, contained 13 mutation sites and maintained nearly full catalytic activity compared to the wild‐type. It showed a 10.19°C increase in the melting temperature and an ~655‐fold increase in the half‐life at 58°C. Additionally, the model successfully captured epistasis in high‐order combinatorial mutants, including sign epistasis (K351E) and synergistic epistasis (D17V/I149V). We elucidated the mechanism of long‐range epistasis in detail using a dynamics cross‐correlation matrix method. Our work provides an efficient framework for designing enzyme thermostability and studying high‐order epistatic effects in protein‐directed evolution.
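The abstract distinguishes sign epistasis from synergistic epistasis. As a reading aid, the sketch below shows one common way such labels can be assigned from melting-temperature shifts (ΔTm relative to the wild type) by comparing a double mutant against the additive expectation of its two single mutants. This is not the authors' Pro-PRIME workflow; the function names, the tolerance `tol`, and the example values are hypothetical, and the naming assumes that a higher ΔTm is better.

```python
def pairwise_epistasis(d_a, d_b, d_ab):
    """Deviation of the double mutant from the additive expectation.

    d_a, d_b: change in Tm of each single mutant relative to wild type.
    d_ab:     change in Tm of the double mutant relative to wild type.
    """
    return d_ab - (d_a + d_b)


def classify(d_a, d_b, d_ab, tol=0.5):
    """Label a mutation pair; `tol` is an assumed measurement tolerance."""
    eps = pairwise_epistasis(d_a, d_b, d_ab)
    if abs(eps) <= tol:
        return "additive"
    # Effect of each mutation when added on top of the other one.
    d_b_in_a = d_ab - d_a
    d_a_in_b = d_ab - d_b
    # Sign epistasis: a mutation's effect flips sign depending on background.
    if d_b * d_b_in_a < 0 or d_a * d_a_in_b < 0:
        return "sign"
    return "synergistic" if eps > 0 else "antagonistic"


if __name__ == "__main__":
    # Hypothetical values in degrees Celsius, for illustration only.
    print(classify(1.0, 2.0, 5.0))
    print(classify(-1.0, 2.0, 4.0))
```

Under these assumptions, a mutation that is harmful on its own but beneficial in a combinatorial background would be labeled "sign", while a pair whose joint gain exceeds the sum of the individual gains would be labeled "synergistic".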
id doaj-art-f55870c1d8804c7c81bd6f82a9333744
institution Kabale University
issn 2770-100X
spelling Record doaj-art-f55870c1d8804c7c81bd6f82a9333744, indexed 2024-12-31T09:24:49Z
mLife (Wiley, ISSN 2770-100X), 2024-12-01, volume 3, issue 4, pages 492-504, https://doi.org/10.1002/mlf2.12151
Jiahao Bian: State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
Pan Tan: Shanghai Artificial Intelligence Laboratory, Shanghai, China
Ting Nie: State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
Liang Hong: Shanghai Artificial Intelligence Laboratory, Shanghai, China
Guang‐Yu Yang: State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
topic combinatorial mutants
creatinase
epistasis
protein language model
thermostability