PPB-Affinity: Protein-Protein Binding Affinity dataset for AI-based protein drug discovery

Abstract Prediction of protein-protein binding (PPB) affinity plays an important role in large-molecule drug discovery. Deep learning (DL) has been adopted to predict changes in PPB affinity upon mutation, but few studies have predicted the PPB affinity itself. The ma...

Bibliographic Details
Main Authors: Huaqing Liu, Peiyi Chen, Xiaochen Zhai, Ku-Geng Huo, Shuxian Zhou, Lanqing Han, Guoxin Fan
Format: Article
Language: English
Published: Nature Portfolio 2024-12-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-024-03997-4
_version_ 1846137289286090752
collection DOAJ
description Abstract Prediction of protein-protein binding (PPB) affinity plays an important role in large-molecule drug discovery. Deep learning (DL) has been adopted to predict changes in PPB affinity upon mutation, but few studies have predicted the PPB affinity itself. The major reason is the paucity of open-source datasets with PPB affinity data. To address this gap, this study introduces a large, comprehensive PPB affinity (PPB-Affinity) dataset. The PPB-Affinity dataset contains key information such as crystal structures of protein-protein complexes (with or without protein mutation patterns), PPB affinity, receptor protein chain, ligand protein chain, etc. To the best of our knowledge, this is the largest publicly available PPB affinity dataset, and we believe it will significantly advance drug discovery by streamlining the screening of potential large-molecule drugs. We also developed a deep-learning benchmark model with this dataset to predict the PPB affinity, providing a baseline for comparison for the research community.
format Article
id doaj-art-84b520fde9814b32a761fda2a63cf8db
institution Kabale University
issn 2052-4463
language English
publishDate 2024-12-01
publisher Nature Portfolio
record_format Article
series Scientific Data
spelling doaj-art-84b520fde9814b32a761fda2a63cf8db
2024-12-08T12:17:56Z
eng
Nature Portfolio
Scientific Data
2052-4463
2024-12-01
111111
10.1038/s41597-024-03997-4
PPB-Affinity: Protein-Protein Binding Affinity dataset for AI-based protein drug discovery
Huaqing Liu [0]: Artificial Intelligence Innovation Center, Research Institute of Tsinghua, Pearl River Delta
Peiyi Chen [1]: Artificial Intelligence Innovation Center, Research Institute of Tsinghua, Pearl River Delta
Xiaochen Zhai [2]: Cyagen Biosciences (Suzhou) Inc.
Ku-Geng Huo [3]: Cyagen Biosciences (Guangzhou) Inc.
Shuxian Zhou [4]: Artificial Intelligence Innovation Center, Research Institute of Tsinghua, Pearl River Delta
Lanqing Han [5]: Artificial Intelligence Innovation Center, Research Institute of Tsinghua, Pearl River Delta
Guoxin Fan [6]: Department of Pain Medicine, Shenzhen Nanshan People’s Hospital, Shenzhen University Medical School