Machine learning proteochemometric models for Cereblon glue activity predictions

Bibliographic Details
Main Authors: Francis J. Prael, III, Jiayi Cox, Noé Sturm, Peter Kutchukian, William C. Forrester, Gregory Michaud, Jutta Blank, Lingling Shen, Raquel Rodríguez-Pérez
Format: Article
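Language: English
Published: Elsevier 2024-12-01
Series: Artificial Intelligence in the Life Sciences
Subjects: Machine learning; Glues; Targeted protein degradation; Cereblon; Proteochemometric models; Chemogenomics
Online Access: http://www.sciencedirect.com/science/article/pii/S2667318524000072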
author Francis J. Prael, III
Jiayi Cox
Noé Sturm
Peter Kutchukian
William C. Forrester
Gregory Michaud
Jutta Blank
Lingling Shen
Raquel Rodríguez-Pérez
collection DOAJ
description Targeted protein degradation (TPD) is a rapidly developing drug discovery technique with unique efficacy and target scope stemming from its degradation-based activity. Molecular glue degraders are a promising arm of TPD, as evidenced by the FDA-approved therapeutics within this class, the increasing number of degraders in clinical development, and their predisposition to drug-likeness. Cereblon (CRBN) glue degraders mediate target degradation by generating a neomorphic interface between CRBN and a protein of interest. While promising, the complicated nature of this CRBN-glue-target ternary complex makes the rational design of molecular glue degraders challenging. For other drug modalities, predictive modeling has been established to leverage existing activity data and generate quantitative structure-activity relationships (QSAR). However, the applicability of QSAR strategies for glues remains under-investigated. Herein, machine learning methodologies were developed to predict glue-mediated recruitment of CRBN to target proteins and achieved promising performance. Generated models leveraged more than a hundred internal screening campaigns across thousands of CRBN glues to predict glue-mediated recruitment of targets to CRBN. Our results show that recruitment activity of CRBN glue degraders can be modeled by machine learning, with 89% of models producing an area under the receiver operating characteristic curve (ROC AUC) > 0.8 and 70% of models producing a Matthews correlation coefficient (MCC) > 0.2 for these primary screening data. Importantly, our findings also indicate that the combination of compound and protein descriptors in the so-called proteochemometric models improves performance, with >80% of the models exhibiting higher ROC AUC and MCC values than per-target models based only on compound information. Hence, our investigations suggest that proteochemometric modeling is a successful approach for molecular glue degraders. The proposed machine learning strategies can aid compound prioritization based on recruitment efficacy and target selectivity, and thus have the potential to facilitate the design and discovery of therapeutic CRBN molecular glues.
format Article
id doaj-art-55a983e8b68c47b29a24a8f5a592f2c3
institution Kabale University
issn 2667-3185
language English
publishDate 2024-12-01
publisher Elsevier
record_format Article
series Artificial Intelligence in the Life Sciences
spelling doaj-art-55a983e8b68c47b29a24a8f5a592f2c3; 2024-12-14T06:33:53Z; eng; Elsevier; Artificial Intelligence in the Life Sciences; 2667-3185; 2024-12-01; volume 6; article 100100
Machine learning proteochemometric models for Cereblon glue activity predictions
Francis J. Prael, III (Novartis Biomedical Research, Novartis Campus, 02139 Cambridge MA, USA)
Jiayi Cox (Novartis Biomedical Research, Novartis Campus, 02139 Cambridge MA, USA)
Noé Sturm (Novartis Biomedical Research, Novartis Campus, 4002 Basel, Switzerland)
Peter Kutchukian (Novartis Biomedical Research, Novartis Campus, 02139 Cambridge MA, USA)
William C. Forrester (Novartis Biomedical Research, Novartis Campus, 02139 Cambridge MA, USA)
Gregory Michaud (Novartis Biomedical Research, Novartis Campus, 02139 Cambridge MA, USA)
Jutta Blank (Novartis Biomedical Research, Novartis Campus, 02139 Cambridge MA, USA)
Lingling Shen (Novartis Biomedical Research, Novartis Campus, 02139 Cambridge MA, USA; corresponding author)
Raquel Rodríguez-Pérez (Novartis Biomedical Research, Novartis Campus, 4002 Basel, Switzerland; corresponding author)
title Machine learning proteochemometric models for Cereblon glue activity predictions
topic Machine learning
Glues
Targeted protein degradation
Cereblon
Proteochemometric models
Chemogenomics
url http://www.sciencedirect.com/science/article/pii/S2667318524000072