Predicting phage-host interaction via hyperbolic Poincaré graph embedding and large-scale protein language technique

Summary: Bacteriophages (phages) are increasingly viewed as a promising alternative for the treatment of antibiotic-resistant bacterial infections. However, the diversity of host ranges complicates the identification of target phages. Existing computational tools often fail to accurately identify ph...

Bibliographic Details
Main Authors: Jie Pan, Rui Wang, Wenjing Liu, Li Wang, Zhuhong You, Yuechao Li, Zhemeng Duan, Qinghua Huang, Jie Feng, Yanmei Sun, Shiwei Wang
Format: Article
Language: English
Published: Elsevier 2025-01-01
Series: iScience
Online Access: http://www.sciencedirect.com/science/article/pii/S2589004224028748
collection DOAJ
description Summary: Bacteriophages (phages) are increasingly viewed as a promising alternative for the treatment of antibiotic-resistant bacterial infections. However, the diversity of host ranges complicates the identification of target phages. Existing computational tools often fail to accurately identify phages across different bacterial species. In this study, we present GE-PHI, a machine-learning-based model for predicting phage-host interactions (PHIs) by integrating a knowledge graph embedding algorithm with a large-scale protein language model. First, a phage-host heterogeneous association network (PHAN) was constructed that incorporated phage-phage and host-host similarity networks. Then, the multi-relational Poincaré graph embedding (MuRP) was used to extract topological patterns. Additionally, we employed the ESM-2 protein language model to capture evolutionary information from phage tail proteins and host-receptor-binding proteins. GE-PHI achieved a cross-validation area under the curve (AUC) of up to 0.9453 in silico and maintained this performance in case studies. This study provides insights into machine-learning-guided phage therapeutics and diagnostics in microbial engineering.
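The description mentions Poincaré graph embedding, which places network nodes inside the unit ball and compares them by hyperbolic distance. As a rough illustration only, the sketch below computes the standard Poincaré-ball distance, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))), between toy points. It is not the GE-PHI or MuRP implementation: the function name `poincare_distance` and the 2-D coordinates are hypothetical, and MuRP additionally applies learned relation-specific transformations that are not shown here.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Hypothetical 2-D embeddings, chosen only for illustration (not from the paper).
phage = [0.10, 0.20]
host_near = [0.15, 0.25]
host_far = [-0.70, -0.60]

print(poincare_distance(phage, host_near))
print(poincare_distance(phage, host_far))
```

Distances grow rapidly as points approach the boundary of the ball, which is why hyperbolic embeddings are often used for hierarchical or tree-like networks.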
id doaj-art-445472c60f8344288e49e959da081263
institution Kabale University
issn 2589-0042
spelling doaj-art-445472c60f8344288e49e959da081263 (record timestamp 2025-01-08T04:53:18Z)
citation iScience (Elsevier), ISSN 2589-0042, published 2025-01-01, volume 28, issue 1, article 111647
affiliations Jie Pan: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi’an 710069, China
Rui Wang: Department of Ophthalmology, The First Affiliated Hospital of Northwest University, 30 Fenxiang, the South Avenue, Xi’an, Shaanxi 710002, China
Wenjing Liu: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi’an 710069, China
Li Wang: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi’an 710069, China
Zhuhong You: School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
Yuechao Li: School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
Zhemeng Duan: Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi’an 710069, China
Qinghua Huang: School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an, Shaanxi 710072, China
Jie Feng: State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
Yanmei Sun (corresponding author): Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi’an 710069, China
Shiwei Wang (corresponding author): Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi’an 710069, China
topic Virology
Microbiology
Bacteriology
Machine learning